# Just Created a Tree-sitter Parser: Now What?

The authors of [Tree-sitter](https://tree-sitter.github.io/tree-sitter/) have provided detailed [documentation](https://tree-sitter.github.io/tree-sitter/creating-parsers/index.html) on creating your own language parsers. So, following the tutorial, you have meticulously crafted your own grammar using `tree-sitter-cli`. You ran `tree-sitter generate` as the docs instructed, and this generated the C code required to parse your language. You wrote a simple code snippet and tested parsing it with `tree-sitter parse path/to/your/code`. The result matches your expectations perfectly. Everything is under control.

Now it's time for the next step. You can't wait to integrate the parser into your project written in Python/C++/Java/... Tree-sitter provides bindings for these languages, making it easy to work with, and a list of known parsers is readily available for you to explore and use. But what about custom parsers? You go back to the documentation, and it looks like the story ends when the parser is created. Wait, what?

Actually, you're not alone in this. An [issue](https://github.com/tree-sitter/tree-sitter/issues/643) on Tree-sitter's GitHub repository raised a similar question, but I think they were overcomplicating things. Let's keep it simple and see what the CLI has generated for us:

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1763350059565/5007666d-f356-42f6-9053-d7235e40bce9.png align="left")

It looks like the `bindings` folder contains bindings for our parser in different languages. However, when we look closer at the subfolders, these bindings are just scaffolding without the compiled parser. In the root directory of the parser project, we also find some files like `setup.py`. Since I'm working with Python, I opened the `setup.py` script and found these lines in the `setup()` call:

```python
setup(
    packages=find_packages("bindings/python"),
    package_dir={"": "bindings/python"},
    package_data={
        "tree_sitter_imp": ["*.pyi", "py.typed"],
        "tree_sitter_imp.queries": ["*.scm"],
    },
    ext_package="tree_sitter_imp",
    ext_modules=[
        Extension(
            name="_binding",
            sources=[
                "bindings/python/tree_sitter_imp/binding.c",
                "src/parser.c",
            ],
            define_macros=[
                ("PY_SSIZE_T_CLEAN", None),
                ("TREE_SITTER_HIDE_SYMBOLS", None),
            ],
            include_dirs=["src"],
            py_limited_api=not get_config_var("Py_GIL_DISABLED"),
        )
    ],
    cmdclass={
        "build": Build,
        "build_ext": BuildExt,
        "bdist_wheel": BdistWheel,
        "egg_info": EggInfo,
    },
    zip_safe=False
)
```

Isn't this exactly what we need to combine the parser with the Python language binding? As in the setup script of any package with C/C++ extensions, the `build_ext` command compiles the extension modules listed in `ext_modules`, and its `--inplace` option places the compiled result next to the Python sources instead of in a separate build directory. So, I set up a Python virtual environment and ran `python setup.py build_ext --inplace`. And there we have it! A dynamic library file named `_binding.abi3.so` has been built and copied into the Python language binding folder.
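If you want to confirm where the build put the compiled module, a small check like the following may help. It is only a sketch: the helper name `find_built_bindings` is made up for illustration, and the package path is assumed from the `setup.py` above. It relies on `importlib.machinery.EXTENSION_SUFFIXES`, the standard-library list of file endings the current interpreter accepts for extension modules.

```python
from importlib.machinery import EXTENSION_SUFFIXES
from pathlib import Path


def find_built_bindings(package_dir: str, module_name: str = "_binding") -> list[Path]:
    """Return compiled extension files for `module_name` found in `package_dir`.

    Hypothetical helper for illustration; not part of Tree-sitter or setuptools.
    """
    package = Path(package_dir)
    # EXTENSION_SUFFIXES holds the file endings this interpreter imports as
    # extension modules (on CPython for Linux/macOS it typically includes ".abi3.so").
    candidates = (package / f"{module_name}{suffix}" for suffix in EXTENSION_SUFFIXES)
    return [path for path in candidates if path.is_file()]


if __name__ == "__main__":
    # Assumed location, taken from the package_dir setting in setup.py.
    for path in find_built_bindings("bindings/python/tree_sitter_imp"):
        print(path)
```

An empty result would suggest the build did not run, or that it ran under a different interpreter than the one doing the check.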

Now, let's test the functionality of the package. In the `bindings/python/tests` folder, there's a simple test script called `test_binding.py`. We install all the dependencies and run it, and BAM! It works. From now on, I can copy the `tree_sitter_imp` package folder into my project and create a parser like this:

```python
from tree_sitter import Language, Parser
import tree_sitter_imp

my_parser = Parser(Language(tree_sitter_imp.language()))
```
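To check what the parser actually produces, it can help to print the syntax tree as an indented outline. The function below is a minimal sketch, not something the generated package provides: `format_tree` is a hypothetical name, and it only assumes that each node exposes `type` and `children`, as `tree_sitter.Node` objects do.

```python
def format_tree(node, depth: int = 0) -> str:
    """Render a node and its descendants as an indented outline.

    `node` is expected to behave like a tree_sitter.Node; only its
    `type` and `children` attributes are used here.
    """
    lines = ["  " * depth + node.type]
    for child in node.children:
        lines.append(format_tree(child, depth + 1))
    return "\n".join(lines)


# Usage with the parser built above (the file path is a placeholder):
#   with open("path/to/your/code", "rb") as f:
#       tree = my_parser.parse(f.read())
#   print(format_tree(tree.root_node))
```

Because it only touches those two attributes, the same function can be tried on stand-in objects before the real parser is wired in.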

And that's it. Have fun with your custom parser ;-)

## …Yet Another Hack

In fact, my initial solution for this issue was veeeeeery different.

Remember that the `generate` command creates the C code for the parser, right? This means we can always build the parser as a library. Tree-sitter provides [a tool for this](https://tree-sitter.github.io/tree-sitter/cli/build.html). Running `tree-sitter build` produces a dynamic library in the root folder of the parser project. I'm working on macOS, so it's named `imp.dylib`. If you're on Linux or Windows, the name will end with `.so` or `.dll` respectively.
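Since the file name depends on the platform, a project that loads the library on more than one OS might compute the name rather than hard-code it. Here is a minimal sketch, assuming the library keeps the `<language name><suffix>` naming described above; `shared_library_name` is a hypothetical helper, not part of Tree-sitter:

```python
import sys


def shared_library_name(lang_name: str, platform: str = sys.platform) -> str:
    """Guess the file name of the parser library for `lang_name`.

    Assumes the default naming seen above (e.g. imp.dylib on macOS).
    """
    if platform == "darwin":
        suffix = ".dylib"
    elif platform in ("win32", "cygwin"):
        suffix = ".dll"
    else:
        # Linux and other Unix-like systems
        suffix = ".so"
    return lang_name + suffix
```

The `platform` parameter defaults to the running interpreter's `sys.platform`, but passing it explicitly makes the function easy to check for all three cases.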

To build a `Parser` object, you need to pass a `Language` object to the initializer, and for a custom language parser, the `Language` initializer needs the pointer returned by the language function, such as `tree_sitter_my_lang()`. So we only need to load the dynamic library into our project and get a handle to that function. How do we deal with dynamic libraries? Different answers for different languages. In Python, we use the `ctypes` module. And this is how I created a wrapper for the library:

```python
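import ctypes
from tree_sitter import Language, Parser

LANG_NAME = 'imp'
LIB_PATH = 'lib/imp.dylib'  # ends with .so on Linux or .dll on Windows

def imp_parser() -> Parser:
    # Load the shared library produced by `tree-sitter build`.
    lib = ctypes.CDLL(LIB_PATH)
    # The exported entry point is named tree_sitter_<language name>.
    lang_func = getattr(lib, f'tree_sitter_{LANG_NAME}')
    # ctypes assumes a C int return type by default, which would truncate
    # a 64-bit pointer, so declare the result as a void pointer.
    lang_func.restype = ctypes.c_void_p
    return Parser(Language(lang_func()))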

Just put the library under a `lib` subfolder and import the wrapper; calling `imp_parser()` then gives you a parser for the language directly. Note that the relative `LIB_PATH` is resolved against the current working directory, so run your program from the folder that contains `lib`. Again, have fun! But I suggest using the first approach. Much more elegant, isn't it? 🤪
