dvolkow/mystemex

Mystemex

An Elixir wrapper for Yandex Mystem 3, a morphological analyzer for the Russian language.

A Quick Example

Lemmatization:

iex(1)> text = "Красивая мама красиво мыла раму"
"Красивая мама красиво мыла раму"
iex(2)> {:ok, lemmas} = Mystemex.lemmatize(text)
{:ok,
["красивый", "мама", "красиво", "мыть", "рама"]}

Getting grammatical information and lemmas:

iex(4)> {:ok, analyze} = Mystemex.analyze(text)
{:ok,
  [
    %{
      "analysis" => [
        %{
          "gr" => "A=им,ед,полн,жен",
          "lex" => "красивый",
          "wt" => 1
        }
      ],
      "text" => "Красивая"
    },
    %{
      "analysis" => [
        %{"gr" => "S,жен,од=им,ед", "lex" => "мама", "wt" => 1}
      ],
      "text" => "мама"
    },
    %{
      "analysis" => [
        %{"gr" => "ADV=", "lex" => "красиво", "wt" => 0.8149252476}
      ],
      "text" => "красиво"
    },
    %{
      "analysis" => [
        %{
          "gr" => "V,несов,пе=прош,ед,изъяв,жен",
          "lex" => "мыть",
          "wt" => 0.441520999
        }
      ],
      "text" => "мыла"
    },
    %{
      "analysis" => [
        %{
          "gr" => "S,жен,неод=вин,ед",
          "lex" => "рама",
          "wt" => 0.9993591156
        }
      ],
      "text" => "раму"
    }
]}
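
The token maps above can be post-processed with plain Elixir. The following is a minimal sketch, not part of Mystemex: `AnalysisSketch` and `summarize/1` are hypothetical names. It assumes each token has the shape shown in the example and that the part of speech is the first tag in "gr", before the first "," or "=".

```elixir
defmodule AnalysisSketch do
  # Hypothetical helper, not part of Mystemex.
  # Returns {text, lemma, part_of_speech} for the highest-weighted analysis.
  def summarize(%{"text" => text, "analysis" => [_ | _] = analyses}) do
    best = Enum.max_by(analyses, &Map.get(&1, "wt", 0))
    # Assumption: the part of speech is the first tag of "gr".
    [pos | _] = String.split(best["gr"], [",", "="], parts: 2)
    {text, best["lex"], pos}
  end

  # Fallback for tokens with no analysis (assumed possible for unknown words).
  def summarize(%{"text" => text}), do: {text, nil, nil}
end

# Sample token copied from the output above.
token = %{
  "analysis" => [%{"gr" => "S,жен,од=им,ед", "lex" => "мама", "wt" => 1}],
  "text" => "мама"
}

AnalysisSketch.summarize(token)
# => {"мама", "мама", "S"}
```

With a real result, the same function could be mapped over the list, for example `Enum.map(analyze, &AnalysisSketch.summarize/1)`.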

More usage examples can be found in the test directory.

Types

Return value types are described in Mystemex.Types.

Settings

See config/configs.exs.

mystem_path is the path to your Mystem binaries;

pool_size is the size of the worker pool;

pool_max_overflow is the maximum overflow allowed beyond pool_size.
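
As a sketch, the three settings could be set together in your application's config. The key names come from the list above; the path and pool sizes are placeholder values. Whether mystem_path should name the executable itself or its containing directory is an assumption here, so check it against the repository's own config.

```elixir
import Config

# Placeholder values; adjust them to your system and workload.
config :mystemex,
  # Assumed here to point at the mystem executable itself.
  mystem_path: "/usr/local/bin/mystem",
  pool_size: 5,
  pool_max_overflow: 2
```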

Installation

  1. Install the Mystem binary package on your system. Download it from https://yandex.ru/dev/mystem/

  2. If available in Hex, the package can be installed by adding mystemex to your list of dependencies in mix.exs:

def deps do
  [
    {:mystemex, "~> 0.2.1"}
  ]
end

Alternatively, build from source:

mix deps.get
mix compile

  3. Set the Mystem binaries path (:mystemex, :mystem_path in your config).

⚠️ License Notice

This library is MIT-licensed, but Mystem itself is proprietary Yandex software. You must download it separately and accept its license.
