1. What is Mosaic ML
https://www.databricks.com/research/mosaic
https://www.linkedin.com/company/mosaicresearch/about/
At Databricks Mosaic AI, we believe that all organizations should have access to state-of-the-art data + AI capabilities. The Mosaic Research team is continually evaluating methods to optimize the model development process - from algorithms to systems to hardware - so you can get more accurate insights, faster. Our rigorous science leads to real results.
2. Technology
1) DBRX is an open source, commercially usable LLM developed by our team at Databricks and released in March 2024. As of its release, it is the highest-quality open source model available. Thanks to its sparse mixture-of-experts architecture, it is also fast, fitting these extraordinary capabilities into just 36B active parameters.
2) Mosaic Diffusion is a generative model that turns text descriptions into images, designed to be highly efficient.
https://github.com/mosaicml/diffusion
3) Mosaic BERT: pretrain your own BERT model on your data from scratch using Mosaic AI for $20.
https://github.com/mosaicml/examples/tree/main/examples/benchmarks/bert
4) MPT models are a family of open source, commercially usable LLMs released in summer 2023. They include MPT-30B (prioritizing quality) and MPT-7B (prioritizing efficiency). You can download versions of these models that we have trained or you can train your own MPT models on your data using the Mosaic AI Multi-Cloud Training (MCT) product.
5) Composer is an open source deep-learning training library optimized for scalability and usability.
https://github.com/mosaicml/composer
6) LLM Foundry is a highly efficient, open source codebase for training, fine-tuning and evaluating LLMs.
https://github.com/mosaicml/llm-foundry
7) Performance: Our deep learning stack is the most efficient for training, fine-tuning and deploying large models at scale.
https://www.databricks.com/blog/llm-inference-performance-engineering-best-practices
8) StreamingDataset is an open source PyTorch DataLoader that makes it easy and efficient to stream training datasets.
https://github.com/mosaicml/streaming
9) Evaluation Gauntlet is a library for evaluating the quality of generative language models.
https://www.databricks.com/research/mosaic
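Item 1) above credits DBRX's speed to its sparse mixture-of-experts architecture, where only part of the model is active for each token. The following is a minimal sketch of that general idea in plain Python, not DBRX's actual implementation: the expert count, the top-k value, the toy scalar "experts", and the helper names (`route_top_k`, `moe_forward`) are all hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(scores):
    """Convert raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights to sum to 1."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_scores, k):
    """Run only the selected experts and blend their outputs by router weight."""
    chosen = route_top_k(router_scores, k)
    output = sum(weight * experts[i](x) for i, weight in chosen)
    return output, [i for i, _ in chosen]


# Hypothetical toy setup: 8 experts, 2 active per input (illustrative values only).
NUM_EXPERTS = 8
TOP_K = 2

# Each toy "expert" is just a scalar multiplication standing in for a feed-forward block.
experts = [lambda x, scale=scale: scale * x for scale in range(1, NUM_EXPERTS + 1)]

random.seed(0)
scores = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]

output, used = moe_forward(3.0, experts, scores, TOP_K)
active_fraction = TOP_K / NUM_EXPERTS

print("experts used:", used)
print("fraction of expert parameters active:", active_fraction)
```

In this toy setup only 2 of the 8 experts run per input, so a quarter of the expert parameters are touched on each forward pass. That is the sense in which a sparse model can have far fewer active parameters than total parameters.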
3. Team
Mosaic Research has a proven record of making breakthroughs in generative AI and LLMs. Now we're looking for researchers and engineers who want to make an impact. If you're truth-seeking, data-driven and work from first principles, join us.
4. News
24.04.23
https://www.epnc.co.kr/news/articleView.html?idxno=300537
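▶ DBRX outperforms Grok and Llama 2
Databricks also presented the competitiveness of its general-purpose large language model (LLM), 'DBRX'.
DBRX is an LLM that Databricks released this past March in collaboration with 'Mosaic ML', the generative AI platform it acquired. According to Databricks, it outperforms open source models such as Grok, Llama 2 and Mixtral in areas including language understanding (MMLU), programming (HumanEval) and math (GSM8K).
Vice President Meyer (마이어) said, "DBRX boasts the highest level of production quality compared with competing models," and explained that "whatever you generate, you retain full ownership of the model weights and data."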