Make the README more informative with a focus on the project itself #347
Ingvarstep wants to merge 9 commits into main
Sure, I can give a review 🤗
Overall some nice extensions I think, and a pretty logical structure, i.e.: title, tags, relevant banner, one paragraph, and then straight into benefits followed by a very solid quick start. I would probably clean up the em dashes though. In my eyes, they read as low-effort, when clearly a lot of love has gone into this project.
Edit: I would more clearly link to the existing models, e.g.: https://huggingface.co/models?library=gliner
I realise now that I miss that a bit.
It looks minimalistic, but it was actually hard to find the right prompt to generate it that way ;)
| - [Example Notebooks](https://github.com/urchade/GLiNER/tree/main/examples)
| - Finetune on Colab [<img align="center" src="https://colab.research.google.com/assets/colab-badge.svg" />](https://colab.research.google.com/drive/1HNKd74cmfS9tGvWrKeIjSxBt01QQS7bq?usp=sharing)
| ## 🛠 Installation & Usage
| ---
This is very minor, but I think the --- lines are not needed when you also have a new header starting below each
Actually, I looked into it a bit more, and we can definitely remove the --- lines when there is a new header after them.
| ## 🛠 Installation & Usage
| ---
| ## Quick Start
I like the updated README here, very bite-sized. All that's missing (I think) is a link to the docs below each (sub)section, e.g. below the Output:, the Quantization and Compilation, the Serving, etc. You already have some links below Training, that's good.
| Or apply after loading:
| ## Serving
| GLiNER provides a built-in serving interface for batch inference:
I would add 1 extra sentence to explain why someone would serve instead of using the script above.
Btw, is this merged? I see #346 which seems to refer to this
It's still under review, but API and documentation should be stable.
| quantize=True,
| compile_torch_model=True,
| )
| ```
I'm missing the quantization/compilation gains a bit. As a user I'd like to know what this does for me.
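A minimal sketch of how such gains could be measured before quoting them in the README. The helper names (`median_latency`, `speedup`) and the stand-in callables are hypothetical and not part of GLiNER; in practice the two callables would wrap the prediction call of a default model and of one loaded with `quantize=True` / `compile_torch_model=True`.

```python
import statistics
import time
from typing import Callable, Sequence


def median_latency(fn: Callable[[str], object], texts: Sequence[str], repeats: int = 5) -> float:
    """Return the median wall-clock time in seconds for one pass over `texts`."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        for text in texts:
            fn(text)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def speedup(baseline: Callable[[str], object], candidate: Callable[[str], object],
            texts: Sequence[str], repeats: int = 5) -> float:
    """Ratio of baseline latency to candidate latency; above 1.0 means the candidate is faster."""
    # One untimed warm-up call each, so one-off costs (such as compilation) are excluded.
    baseline(texts[0])
    candidate(texts[0])
    return median_latency(baseline, texts, repeats) / median_latency(candidate, texts, repeats)


# Stand-in callables so the sketch runs without a model (hypothetical, for illustration only).
def slow_predict(text: str) -> int:
    time.sleep(0.005)  # simulates the slower, unoptimized path
    return len(text)


def fast_predict(text: str) -> int:
    return len(text)  # simulates the optimized path


if __name__ == "__main__":
    sample_texts = ["GLiNER extracts entities from text."] * 10
    print(f"speedup: {speedup(slow_predict, fast_predict, sample_texts):.2f}x")
```

Reporting the median over several passes keeps a single slow run from skewing the number, and the warm-up call matters most for compiled models, where the first call pays the compilation cost.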
| ---
| <div align="center">
Not very crucial in my opinion, but it's fine to keep.
| - `quantize="bf16"` — bfloat16. Better numerical stability, slightly less speedup (~1.2x).
| - `quantize="int8"` — int8 quantization. On CPU, uses built-in FBGEMM int8 kernels (~1.6x speedup). On GPU, uses [torchao](https://github.com/pytorch/ao) int8 weight-only quantization (~50% memory reduction, no speed gain). Intended for models fine-tuned with quantization-aware training (QAT). Stock DeBERTa-based models lose accuracy with int8.
| - On CPU, fp16/bf16 quantization reduces memory usage but does not improve speed.
| | Architecture | Description |
Nice to see this. I honestly lost track, and I've been following GLiNER-related developments pretty closely.
| | **Uni-encoder** | Strong zero-shot capabilities, supports up to ~50 entity types. The original GLiNER architecture. |
| | **Bi-encoder** | Scalable to massive numbers of entity types via separate text and label encoding. |
| | **RelEx** | Joint NER and relation extraction in a single model. |
| | **GLiNER Decoder** | Hybrid architecture for open NER — entity types are generated with a small decoder for maximum flexibility. |
Suggested change:
- | **GLiNER Decoder** | Hybrid architecture for open NER — entity types are generated with a small decoder for maximum flexibility. |
+ | **GLiNER Decoder** | Hybrid architecture for open NER: entity types are generated with a small decoder for maximum flexibility. |
Personal preference: I would remove the em dashes throughout
| ---
| ## Popular Use Cases
Even better if you can immediately link from a use case to a matching example like these: https://urchade.github.io/GLiNER/usage.html#example-1-extract-company-information
This is a good point, I am adding cross-links to the documentation with real examples.
Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>
Hi @urchade, any thoughts on the README update?
Hi @urchade,
Because the current README highlights other commercial tools and projects, it creates the following issues:
I have crafted a new README proposal to address the issues described above and to make it more informative and better capture the GLiNER project itself.
To highlight GLiNER2 and other amazing projects from the Ecosystem, I created a dedicated section listing them.
Looking forward to your suggestions and feedback.
@tomaarsen, it would be awesome to get your review and feedback as well.
Best,
Ihor