
AI Company

Mosaic ML - State-of-the-art generative AI model development, making data + AI available to all.


 

 

1. What is Mosaic ML?

https://www.databricks.com/research/mosaic

 


 

 

https://www.linkedin.com/company/mosaicresearch/about/

 


At Databricks Mosaic AI, we believe that all organizations should have access to state-of-the art data + AI capabilities. The Mosaic Research team is continually evaluating methods to optimize the model development process - from algorithms to systems to hardware - so you can get more accurate insights, faster. Our rigorous science leads to real results.

 

 

2. Technology

1) DBRX is an open source, commercially usable LLM developed by our team at Databricks and released in March 2024. As of its release, it is the highest-quality open source model available. Thanks to its sparse mixture-of-experts architecture, it is also fast, fitting these extraordinary capabilities into just 36B active parameters.
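The reason only 36B of DBRX's parameters are active per token is its sparse mixture-of-experts routing: a gating network scores 16 experts and only the top 4 are consulted for each token. The expert count and top-k below match DBRX's published configuration, but the router itself is a toy sketch in pure Python, not the actual implementation:

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def route_token(gate_logits, top_k):
    """Return the indices of the top_k experts for one token."""
    probs = softmax(gate_logits)
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

# DBRX's published configuration: 16 experts, 4 active per token,
# which is how 132B total parameters shrink to ~36B active ones.
NUM_EXPERTS, TOP_K = 16, 4

random.seed(0)
gate_logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
active_experts = route_token(gate_logits, TOP_K)
```

Per token, only the selected experts' feed-forward weights participate in the forward pass, which is why inference cost tracks active rather than total parameters.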

 

2) Mosaic Diffusion is a generative model that turns text descriptions into images, designed to be highly efficient.

https://github.com/mosaicml/diffusion
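Diffusion models of this kind are trained by gradually noising images and learning to reverse the process. The following is a minimal sketch of the standard DDPM forward (noising) process in pure Python, not Mosaic Diffusion's actual code; the schedule constants are the commonly used defaults, chosen here only for illustration:

```python
import math
import random

def forward_diffuse(x0, t, T=1000, beta_min=1e-4, beta_max=0.02):
    """Apply t steps of the DDPM forward process to a sample x0.

    Returns the noised sample x_t and the cumulative signal
    fraction alpha_bar = prod(1 - beta_s) for s < t.
    """
    alpha_bar = 1.0
    for s in range(t):
        beta = beta_min + (beta_max - beta_min) * s / (T - 1)  # linear schedule
        alpha_bar *= 1.0 - beta
    eps = [random.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
          for x, e in zip(x0, eps)]
    return xt, alpha_bar

random.seed(0)
x0 = [1.0] * 8                         # stand-in for image pixel values
_, ab_early = forward_diffuse(x0, 10)  # little noise: alpha_bar near 1
_, ab_late = forward_diffuse(x0, 900)  # heavy noise: alpha_bar near 0
```

The model's job at training time is the reverse: given `xt` and `t`, predict the noise `eps`, which is what makes generation from pure noise possible.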

 

3) Mosaic BERT lets you pretrain your own BERT model from scratch on your data using Mosaic AI for $20.

https://github.com/mosaicml/examples/tree/main/examples/benchmarks/bert

 

4) MPT models are a family of open source, commercially usable LLMs released in summer 2023. They include MPT-30B (prioritizing quality) and MPT-7B (prioritizing efficiency). You can download versions of these models that we have trained or you can train your own MPT models on your data using the Mosaic AI Multi-Cloud Training (MCT) product. 

 

5) Composer is an open source deep-learning training library optimized for scalability and usability.

https://github.com/mosaicml/composer

 

6) LLM Foundry is a highly efficient, open source codebase for training, fine-tuning and evaluating LLMs.

https://github.com/mosaicml/llm-foundry

 

7) Performance: Our deep learning stack is the most efficient for training, fine-tuning and deploying large models at scale.

https://www.databricks.com/blog/llm-inference-performance-engineering-best-practices

 

8) StreamingDataset is an open source PyTorch DataLoader that makes it easy and efficient to stream training datasets.

https://github.com/mosaicml/streaming
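The core idea behind streaming datasets is to fetch shards lazily and yield samples as they arrive, so the full dataset never has to fit on local disk or in memory. A pure-Python sketch of that pattern follows; the `fake_fetcher` is a hypothetical stand-in for the real library's reads from cloud object storage:

```python
import itertools

def stream_shards(shard_fetcher, shard_ids):
    """Lazily fetch shards one at a time and yield their samples.

    Only one shard's worth of data is ever materialized, so the
    total dataset size does not matter to the consumer.
    """
    for shard_id in shard_ids:
        for sample in shard_fetcher(shard_id):
            yield sample

# Hypothetical fetcher standing in for a remote object-store read.
def fake_fetcher(shard_id):
    return [f"shard{shard_id}-sample{i}" for i in range(3)]

# Take the first five samples; only shards 0 and 1 are ever fetched.
first_five = list(itertools.islice(stream_shards(fake_fetcher, range(100)), 5))
```

The real StreamingDataset layers shuffling, resumption, and multi-node coordination on top of this, but the lazy generator is the reason startup is fast even for web-scale corpora.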

 

9) Evaluation Gauntlet is a library for evaluating the quality of generative language models. 

https://www.databricks.com/research/mosaic
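Gauntlet-style evaluation suites typically group many benchmarks into task categories, average accuracy within each category first, and then average across categories so that no single large task dominates the composite score. A toy sketch of that aggregation (category names and numbers are made up, and this is not the Evaluation Gauntlet's actual code):

```python
def composite_score(results):
    """Average accuracy within each category, then across categories.

    results: list of (category, num_correct, num_total) tuples.
    """
    per_category = {}
    for category, correct, total in results:
        c, t = per_category.get(category, (0, 0))
        per_category[category] = (c + correct, t + total)
    accuracies = [c / t for c, t in per_category.values()]
    return sum(accuracies) / len(accuracies)

# Hypothetical benchmark results for three categories.
results = [
    ("world_knowledge", 45, 100),
    ("programming", 30, 50),
    ("math", 20, 80),
]
score = composite_score(results)  # mean of 0.45, 0.60, 0.25
```

Equal category weighting is a deliberate design choice: it rewards models that are broadly capable rather than ones that excel only on the largest benchmark.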

 

3. Team 

Mosaic Research has a proven record of making breakthroughs in generative AI and LLMs. Now we're looking for researchers and engineers who want to make an impact. If you're truth-seeking, data-driven and work from first principles, join us. 

 

 

 

4. News

24.04.23 

https://www.epnc.co.kr/news/articleView.html?idxno=300537

 

"Every company is an AI company" … Databricks emphasizes "data intelligence"

[TechWorld News, reporter Yang Seung-gab] "Every company is a data and artificial intelligence (AI) company." On the 23rd, representatives of the data and AI company Databricks said, "Every successful company already uses data and AI" …

▶ DBRX outperforms Grok and Llama 2

Databricks also highlighted the competitiveness of its general-purpose large language model (LLM), DBRX.

DBRX is an LLM unveiled in March, built in collaboration with Mosaic ML, the generative AI platform Databricks acquired. According to Databricks, it outperforms open source models such as Grok, Llama 2 and Mixtral in areas including language understanding (MMLU), programming (HumanEval) and math (GSM8K).

Vice President Meyer explained, "DBRX boasts the highest level of production quality compared with competing models," adding, "Whatever you generate, you retain full ownership of the model weights and the data."

 
