ClickHouse acquires Langfuse

219 points by tin7in 4 days ago | 96 comments

kmlx 4 days ago |
maybe clickhouse can finally make sense of the langfuse documentation
tuananh 4 days ago |
how does it benefit clickhouse?
ushakov 4 days ago |
they want to enter the llm observability market and langfuse has already built a convenient wrapper around clickhouse that companies have adopted
https://clickhouse.com/blog/clickhouse-raises-400-million-se...
tuananh 4 days ago |
thank you! i missed that news
LunaSea 3 days ago |
Makes no sense to me, there's no synergy between LLMs and databases besides the vector search features.
mercurialsolo 4 days ago |
Clickhouse needs observability models to be more useful to agent run infra
ponywombat 4 days ago |
Ah, the painful migration to Clickhouse from v2 to v3 makes sense now https://langfuse.com/self-hosting/upgrade/upgrade-guides/upg...
__s 4 days ago |
This is not why v3 made those changes over a year ago, your cause & effect are mistaken
bezbac 4 days ago |
Congratulations to everyone involved, quite remarkable considering Langfuse was only founded as part of YC 23.
shmichael 4 days ago |
Without the purchase price, it is unclear whether this deserves congratulations or condolences.
Two years in the LLM race will have definitely depleted their seed raise of $4m from 2023, and with no news of additional funds raised it's more than likely this was a fire sale.
stuartjohnson12 4 days ago |
Anecdotally, from the AI startup scene in London, I do not know folks who swear by Langfuse. Honestly, evals platforms are still only just starting to catch on. I haven't used any tracing/monitoring tools for LLMs that made me feel like, say, Honeycomb does.
7thpower 4 days ago |
I love langfuse, it is my goto.
topicseed 4 days ago |
I'd say out of many generative AI observability platforms, Langsmith and Weave (Weights&Biases) are probably the ones most enterprises use, but there's definitely space for Langfuse, Modelmetry, Arize AI, and other players.
djhn 3 days ago |
While I get what you’re saying, “most enterprises” barely use gen AI in any meaningful sense, and AI observability is an even smaller niche technology.
jascha_eng 4 days ago |
It was not a fire sale I'm pretty sure. Langfuse has been consistently growing, they publish some stats about sdk usage etc so you can look that up.
They also say in the announcement that they had a term sheet for a good series a.
I think the team just took the chance to exit early before the LLM hype crashes down. There is also a question of how big this market really is: they mostly do observability for chatbots, but there are only so many of those, and with other players like OpenAI's tracing, Pydantic Logfire, PostHog etc. they become more of a feature than a product of its own. Without a great distribution system they would eventually fall behind, I think.
2 years to a decent exit (probably 100m cash out or so with a good chunk being Clickhouse shares) seems like a good idea rather than betting on that story to continue forever.
axpy906 4 days ago |
I don’t know about that. I looked at them a couple of months back for prompt management and they were pretty behind in terms of features. Went with PromptLayer
swyx 3 days ago |
say more about what you were looking for that promptlayer had/langfuse didnt?
axpy906 3 days ago |
Ohh. Reply from Shawn. I love your work. As far as I recall, we were not looking for a net of features but specifically a git-like API that could manage and version the prompts: metadata tagging, Jinja2, release labels and easy rollback. Add that up with REST, TypeScript and Python support and it worked pretty well. Langfuse seemed way better at tracing though.
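For illustration, the "version the prompts, move release labels, roll back" idea can be sketched as a toy in-memory registry. This is not PromptLayer's or Langfuse's API; `PromptRegistry`, `commit`, `tag`, and `get` are hypothetical names chosen for the example.

```python
class PromptRegistry:
    """Toy in-memory prompt store with append-only versions and movable labels."""

    def __init__(self):
        self._versions = {}  # prompt name -> list of templates (version 1 is index 0)
        self._labels = {}    # (prompt name, label) -> version number

    def commit(self, name, template):
        """Store a new version of a prompt and return its 1-based version number."""
        versions = self._versions.setdefault(name, [])
        versions.append(template)
        return len(versions)

    def tag(self, name, label, version):
        """Point a release label (e.g. 'production') at an existing version."""
        if not 1 <= version <= len(self._versions.get(name, [])):
            raise KeyError(f"{name} has no version {version}")
        self._labels[(name, label)] = version

    def get(self, name, label):
        """Fetch the template that a label currently points at."""
        return self._versions[name][self._labels[(name, label)] - 1]


registry = PromptRegistry()
v1 = registry.commit("summarize", "Summarize: {{ text }}")
v2 = registry.commit("summarize", "Summarize in one sentence: {{ text }}")
registry.tag("summarize", "production", v2)  # release v2
registry.tag("summarize", "production", v1)  # rollback is just moving the label
```

Because old versions are never overwritten, a rollback only changes which version the label resolves to.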
swyx 2 days ago |
hi! :) thanks for supporting my work. ok yea that makes sense - i was a langfuse only user but sounds like i might want to check out promptlayer (tbh i was until recently of the opinion that you should always check in your prompts into git... flipped recently)
floriferous 3 days ago |
Agreed, Sentry, Posthog, and many more are all doing the exact same thing now, I'd be surprised if this was a good deal for Langfuse. I personally migrated away from it to use Sentry, their software was honestly not that great.
dangoodmanUT 4 days ago |
The fact that all metrics are relative doesn't suggest they got an amazing deal
mritchie712 4 days ago |
the "Prompt Management" part of these products always seemed odd. Does anyone use it? Why?
dandelionv1bes 4 days ago |
I do understand why it’s a product - it feels a bit like what databricks has with model artifacts. Ie having a repo of prompts so you can track performance changes against is good. Especially if say you have users other than engineers touching them (ie product manager wants to AB).
Having said that, I struggled a lot with actually implementing langfuse due to numerous bugs/confusing AI driven documentation. So I’m amazed that it’s being bought to be really frank. I was just on the free version in order to look at it and make a broader recommendation, I wasn’t particularly impressed. Mileage may vary though, perhaps it’s a me issue.
alexpadula 4 days ago |
I thought the docs were pretty good just going through them to see what the product was. For me I just don't see the use-case but I'm not well versed in their industry.
dandelionv1bes 4 days ago |
I think the docs are great to read, but implementing was a completely different story for me, ie, the Ask AI recommended solution for implementing Claude just didn’t work for me.
They do have GitHub discussions where you can raise things, but I also encountered some issues with installation that just made me want to roll the dice on another provider.
They do have a new release coming in a few weeks so I’ll try it again then for sure.
Edit: I think I’m coming across as negative and do want to recommend that it is worth trying out langfuse for sure if you’re looking at observability!
pprotas 4 days ago |
Iterating on LLM agents involves testing on production(-like) data. The most accurate way to see whether your agent is performing well is to see it working on production.
You want to see the best results you can get from a prompt, so you use features like prompt management and A/B testing to see what version of your prompt performs better (i.e. is fit to the model you are using) on production.
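One common way to split production traffic between prompt versions is to hash a stable user ID into a bucket, so the same user always sees the same version. A minimal sketch, assuming string user IDs; `assign_variant` and the variant names are hypothetical and not part of any Langfuse API.

```python
import hashlib
from typing import Sequence


def assign_variant(user_id: str, variants: Sequence[str]) -> str:
    """Deterministically map a user ID to one of the prompt variants."""
    if not variants:
        raise ValueError("at least one variant is required")
    # Hash the ID so buckets do not depend on how IDs are formatted or ordered.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


variants = ["prompt-v1", "prompt-v2"]
chosen = assign_variant("user-42", variants)  # stable across calls and processes
```

Logging the chosen variant alongside each trace is what then lets you compare the versions on production data.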
cunha00 4 days ago |
We use it for our internal doc analysis tool. We can easily extract production generations, save them to datasets and test edge cases. Also, it allows prompt separation in folders. With this, we have a pipeline for doc analysis where we have default prompts and the user can set custom prompts for a part of the pipeline. Execution checks for a user prompt before inference; if there is none, it uses the default prompt, which is already cached in code. We plan to evaluate user prompts to see which may perform better and use them to improve the default prompt.
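The "user prompt if set, otherwise the cached default" lookup described here might look roughly like the sketch below. The stage names, prompt texts, and `resolve_prompt` are assumptions made for the example, not the commenter's actual code.

```python
from typing import Mapping, Optional

# Hypothetical default prompts, kept in code so no lookup is needed at inference time.
DEFAULT_PROMPTS = {
    "extract": "Extract the key fields from the document.",
    "summarize": "Summarize the document in three sentences.",
}


def resolve_prompt(stage: str, user_prompts: Optional[Mapping[str, str]] = None) -> str:
    """Return the user's custom prompt for a pipeline stage, else the default."""
    custom = (user_prompts or {}).get(stage)
    if custom and custom.strip():
        return custom
    return DEFAULT_PROMPTS[stage]


# The user overrides only one stage; the other falls back to the default.
overrides = {"summarize": "Summarize the document in one line."}
extract_prompt = resolve_prompt("extract", overrides)
summarize_prompt = resolve_prompt("summarize", overrides)
```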
swyx 4 days ago |
(congrats team! such a joy to see you succeed)
every single day there is an acquisition on here. what's going on in the macro?
marcklingen 4 days ago |
(thank you!)
Nora23 4 days ago |
Does this mean Langfuse will now have better ClickHouse integration?
jimmySixDOF 4 days ago |
I predict it will be Pydantic next to get picked up by someone for logfire and agent framework.... fine as long as all these open source projects stay open source then good for them
CuriouslyC 4 days ago |
The Pydantic stuff is nice, but in a minimalist way that I don't see being amenable to SaaS/vc/etc.
scolvin 4 days ago |
We have a SaaS platform (Pydantic Logfire - General and AI observability), and we raised our Series A from Sequoia.
For good or bad, I think we're pretty "SaaS/vc/etc." already.
saberience 3 days ago |
Yeah it seems a bit forced though. Like why go from an open source utility library for typing to take VC money and try to shoe-horn in an observability platform that no one was asking for?
My prediction, not going to be a good investment.
scolvin 3 days ago |
ROFL.
I clearly didn't build pydantic "to take VC money" - I maintained Pydantic for 5 years before deciding to take VC money.
I wanted to build logfire and I didn't want to compete with or restrict our open source, so we built logfire.
Let's see on the investment; seems to be going pretty well so far.
gyre007 4 days ago |
> Our goal continues to be building the best LLM engineering platform
Interesting headline for a checks notes time series database company.
stingraycharles 4 days ago |
That’s what you get when you raise a lot of VC capital. Just being the best timeseries database is not enough.
michaelmior 4 days ago |
Note that the headline is from Langfuse, not ClickHouse. Reading the announcement from ClickHouse[0], the headline is "ClickHouse welcomes Langfuse: The future of open-source LLM observability". I think the Langfuse team is suggesting that they will be continuing to do the same work within ClickHouse, not that the entire ClickHouse organization has a goal of building the best LLM engineering platform.
[0] https://clickhouse.com/blog/clickhouse-acquires-langfuse-ope...
cs554 4 days ago |
"Berkshire Hathaway Inc. is an American multinational conglomerate holding company" is a weird thing for a textile manufacturer to call itself. Almost like...businesses expand and evolve?
(they've never been a time series database company either lol)
wodenokoto 4 days ago |
Language models are time series models.
It’s great when you get this insight as a student of NLP, because suddenly your toolset grows quite a bit.
Jgrubb 4 days ago |
Could you elaborate? because that sentence made my brow wrinkle with confusion. I have thought to myself before that all business data problems eventually become time series problems. I'd like to understand your point of view on how LLMs fit into that.
wodenokoto 3 days ago |
Time series just means that the order of features matter. Feature 1 occurs before feature 2.
E.g, fitting a model to house prices, you don’t care if feature 1 is square meters and feature 2 is time on market, or vice versa, but in a time series, your model changes if you reverse the order of features.
With text, the meaning of word 2 is dependent on the meaning of word 1. With stock prices, you expect the price at time 2 to be dependent on time 1.
Text can be modeled as a time series.
A language model tells you the next character/token/word depending on the previous input.
Language models are time series.
It’s not an audacious claim.
Any student of nlp should have met a paper modeling text as time series before writing their thesis. How could you not meet that?
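To make the "next token depends on previous input" point concrete, here is a toy bigram predictor. It is only a sketch of the idea and nothing like a real LLM; `train_bigram` and `predict_next` are hypothetical names.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count, for each token, which tokens follow it in the sequence."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, prev):
    """Return the most frequent successor of `prev`, or None if it was never seen."""
    if prev not in counts:
        return None
    return counts[prev].most_common(1)[0][0]


tokens = "the cat sat on the cat mat".split()
forward = train_bigram(tokens)
backward = train_bigram(tokens[::-1])

# Same tokens, different order, different model: order is part of the data.
next_forward = predict_next(forward, "the")
next_backward = predict_next(backward, "the")
```

Reversing the sequence changes the fitted model, which is the sense of "time series" used above.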
LunaSea 3 days ago |
As a data structure it is an ordered list of integers, but no LLM needs to access it in a database; it's way too slow for anything serious.
RAG and vector Approximate Nearest Neighbour (ANN) search is the go-to use case.
thesz 3 days ago |
[1] https://towardsdatascience.com/llm-powered-time-series-analy...
[2] https://arxiv.org/abs/2506.02389
[3] https://arxiv.org/html/2402.10835v3
Some links from the top of Google search.
Take a look here, also, it's an important law: https://en.wikipedia.org/wiki/Benford%27s_law
It is possible for LLMs to learn Benford's law, implicitly. So they will be non-null predictors of time series data, because time series data is also Benford-law-distributed [4].
[4] https://ui.adsabs.harvard.edu/abs/2017EGUGA..19.2950T/abstra...
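For reference, Benford's law gives the probability of leading digit d as log10(1 + 1/d). A short sketch that computes the distribution (the function name is chosen just for this example):

```python
import math


def benford_probability(d: int) -> float:
    """Probability that the leading decimal digit equals d under Benford's law."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Distribution over leading digits 1..9; digit 1 is the most likely.
distribution = {d: benford_probability(d) for d in range(1, 10)}
```

The nine probabilities sum to 1, and a leading 1 is expected about 30% of the time.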
dangoodmanUT 4 days ago |
Your notes aren't very good. They're not a time series database company, they're a columnar database company. But yeah the LLM bit is weird, database companies _always_ feel like charlatans when it comes to LLMs.
vegabook 4 days ago |
Willing to bet most columnar stores are used for time series.
domoritz 4 days ago |
I suspect most use of columnar databases is OLAP, which is different from what people usually mean when they say time series data.
goodkiwi 3 days ago |
I’d take that bet
hodgesrm 3 days ago |
ClickHouse effectively has a number of personas. Time series is one of them, and ClickHouse has steadily absorbed market share from pure play time series databases over the last few years. Other personas include real-time observability backend (the single biggest use case in my experience) as well as real-time data lake engine. Time series support, column storage, and real-time response are key underlying capabilities. It's quite versatile and fun to use.
Disclosure: I run Altinity, a vendor in this space.
(Update: Disclaimer -> Disclosure. Sigh.)
swyx 3 days ago |
Altinity isn't just a vendor, it's THE ClickHouse vendor before ClickHouse became a company. https://altinity.com/blog/big-news-in-the-clickhouse-communi...
always nice to see a database ceo be "one of us" and/or "write like a real human being".
hodgesrm 3 days ago |
That's very kind of you. We love working on ClickHouse and real-time analytics.
vibedev 4 days ago |
But this is correct? The article that you read is from Langfuse POV, not Clickhouse.
mrits 4 days ago |
They are closer to an LLM database than a time series database. But they aren't very close to either.
madduci 3 days ago |
Diversification is the keyword
rr808 4 days ago |
Just did a funding round. In a sign of the times, clickhouse used to be an interesting DB product, but is now "database software that companies can use as they develop AI agents"
> Database technology startup ClickHouse Inc. has raised $400 million in a new funding round that values the company at $15 billion — more than double its valuation less than a year ago.
https://www.bloomberg.com/news/articles/2026-01-16/clickhous...
embedding-shape 4 days ago |
Investors are finicky creatures, if you've been relying on VC-funding since before, it's hard to stop until you are really successful, and if everyone starts to only look at shiny AI stuff and you still need investors, you end up with not much choice.
I wish there was less of it, we'd have better software then, but :/
esafak 4 days ago |
Would we? You can look at places with less funding and see how many software companies get off the ground.
embedding-shape 4 days ago |
> You can look at places with less funding
Yeah, like FOSS which is drastically underfunded since birth, yet continues to put out software that the entire world ends up relying on, instead of relying on whatever VC-pumped companies are putting out.
I'm not talking "better software" as in "made a lot of money", I meant "better" as in "had a better impact on the world".
esafak 4 days ago |
FOSS software is written by people working at companies that likely owe their existence to VC.
hrimfaxi 4 days ago |
What gave you that idea?
esafak 4 days ago |
Because Silicon Valley, which contributes the majority of the code, is venture backed. For example, 84% of the Linux kernel's development is corporate: https://commandlinux.com/statistics/linux-foundation-growth-...
I don't know why people are so upset here.
weiliddat 4 days ago |
That sounds like more sign of recent times.
FOSS software that many rely on that has been around for a while were non-VC: VCS, Linux / GNU / BSD, web browsers, various programming languages, various databases...
rhplus 3 days ago |
Sure, those projects were un(der)funded in the 80s and 90s but the reason we talk about them today is because of the huge amount of investment - both direct and in kind - that VC backed companies have managed to give to many of them.
I think it’s easy to forget how long ago it was when FOSS truly was the outsider and wouldn’t be touched by most companies.
Mozilla/Firefox started in 1998 and then started taking ad revenue from Google in 2005, which pays for a large chunk of its development. It’s been part of the Silicon Valley money machine for 20 years, most of its existence.
jedberg 3 days ago |
Many of your examples came from people who were funded by Universities in the 80s, which was basically the VC of the time. And in the 90s, a lot of the core committers of those projects were already working at VC funded companies.
Back then it was very normal to get VC funding and then hire the core committers of your most important open source software and pay them to keep working on it. I worked at Sendmail in the 90s and we had Sendmail committers (obviously) but also BSD core devs and linux core devs on staff. We also had IETF members on staff.
And we weren't unique, this happened a lot.
weiliddat 3 days ago |
Thanks for the insight and history. Glad to be corrected.
Was it different in nature from current VC-funded FOSS though? It sounds like their contributions to FOSS were tangential and not the sold product?
Maybe a bit more like Google and Chrome?
hodgesrm 3 days ago |
> FOSS software that many rely on that has been around for a while were non-VC: VCS, Linux / GNU / BSD, web browsers, various programming languages, various databases...
It's honestly hard to pick a pattern out for older open source project contributions. PostgreSQL started at UC Berkeley but people contributed to it from all over. Key engineers like Tom Lane worked at a number of companies in the database field, some dependent on VC funding, some not. He's currently at Snowflake. [0] A lot of recent innovation around PostgreSQL today (Neon, Supabase, etc.) is VC funded.
That pattern changed with projects like Hadoop, which was about the time that VC funds recognized a standard playbook around monetizing open source. [1]
[0] https://en.wikipedia.org/wiki/Tom_Lane_(computer_scientist)
[1] https://en.wikipedia.org/wiki/Cloudera
TheTaytay 3 days ago |
I don’t know why you are being downvoted. I mean, I guess I do, but sheesh, they are really shooting the messenger here. Maybe they are looking for more nuance: a lot of software is/was written by people working at…
I don’t think everything VCs touch is gold, but it’s also not the case that they are pure evil either. It’s almost as if you can’t claim they are all good or all bad.
CodingJeebus 4 days ago |
I sometimes wonder if the VC ecosystem creates its own confirmation bias by making it easy to see and aggregate companies it incubates. Whenever I look for jobs, I'm always surprised to find companies that have taken no VC funding and don't try particularly hard to market to the industry as a whole, preferring instead to stay relatively under the radar.
They tend to have more grounded financials (read: paths to profitability) and while the pay packages aren't quite aligned with the top end of the market, they also tend to manage headcount more responsibly than FAANG. I work with a fairly niche stack and I'm constantly finding new companies that I've never heard of and don't raise VC rounds.
Long way of saying that just because they're not easy to find doesn't mean they don't exist.
shimman 3 days ago |
What but? If this is the "best" that VC can do with the money, the US government should simply tax it away from them. Absolutely the worst way to allocate resources and develop a robust forward looking tech industry, you're just chasing the shiny while fucking over the commons.
embedding-shape 3 days ago |
Maybe my point wasn't clear, I agree with you. It's a bad way of allocating resources, and we'd have had better software without it.
steveBK123 4 days ago |
I think it’s hard to make money as a pure play DB vendor and has been for a decade or two. So they all inevitably pivot into some service specific to whatever the hot use case of the moment is… Cybersecurity. Observability. Crypto. AI.
debarshri 3 days ago |
The fundraising market is very interesting right now. You have to have some AI and agent narrative, without which you do not look very forward looking. You might be a database company with millions in revenue, but if you do not have an AI narrative you would not be perceived as forward looking compared to a startup that's burning through millions in tokens with no path to profitability. It has become table stakes and the new reality for startups.
data-ottawa 3 days ago |
This has made buying products hard.
What do you do? “We power your agents” okay… but what do you do? How do you do that?
Every DB, storage system, and analytics tool website is like this lately.
nikcub 3 days ago |
Clickhouse are on a bit of a roll. acquired peerdb and are doing a hosted postgres product[0]
Acquired hyperdx[1] for their clickstack[2] observability platform and adding langfuse to a bunch of other llm related acquisitions and products
They're really building out a snowflake / databricks alternative
[0] https://clickhouse.com/cloud/postgres
[1] https://clickhouse.com/docs/cloud/manage/hyperdx
[2] https://clickhouse.com/use-cases/observability
7thpower 4 days ago |
Langfuse has been my favorite LLM observability solution so far. Hopefully this acquisition makes it better, not worse.
CuriouslyC 4 days ago |
As a big Clickhouse fan, agent evals are where their product really shines. They're buying into market segment where their product is succeeding so they can vertically integrate and tighten up the feedback loop.
dpkirchner 4 days ago |
Are you talking about this sort of thing? https://clickhouse.com/blog/tracing-openai-agents-clickstack
kankerlijer 4 days ago |
For those building applications with Langfuse and Clickhouse - do you like these products? I get the odd request to do an AI thing, and my previous experience with LLM wrappers convinced me to stay away from them (Langchain, Llamaindex, Autogen, others). In some cases they were poorly written, and in other ways the march of progress rendered their tooling irrelevant fairly quickly. Are these better?
embedding-shape 4 days ago |
The observability stuff can be nice for deployments but really, these libraries/frameworks don't actually do much more than provide some structure, which unless you're expecting a team with high turnover to maintain it, doesn't really matter all that much, especially if you're an experienced developer, you'll find better design/architectures fitting for your use case without them.
st3fan 4 days ago |
Hm I find this very much a "please reinvent the wheel" take.
These frameworks provide structure for established patterns, but they also actually do a lot that you don't have to do anymore. If you are for example building an agentic application then these kinds of frameworks make it very simple to create the workflows, do the chat with the model providers, provide structure for agentic skills, decision making and the human in the loop, etc. etc.
All stuff that I would consider "low level". All things you don't have to build.
If you have an aversion to frameworks then sure - by all means. But if you like to move faster and using good building blocks then these frameworks really help.
One thing to keep in mind - many of these AI frameworks are open source and work really well without needing backend services. Or you can self host them where needed. But for many that is also the premium model, please use and pay for our backend services. But that is also a choice of course.
embedding-shape 4 days ago |
> All stuff that I would consider "low level". All things you don't have to build.
But those are also very trivial to build, and you end up having to customize them for your needs, and if the framework doesn't have those levers, better be prepared to either fork the framework, or spend time contributing upstream.
Or, start simple yourself with what you need, use libraries for the hairy parts you don't want to be responsible for the implementation of, then pipe these things together. You'll get a less compromised experience, and you'll understand 100% how everything works, which is the part people generally try to avoid and that's why they're reaching for frameworks.
> But if you like to move faster and using good building blocks then these frameworks really help.
I find that they help a lot with the "move faster" part in the beginning, but after that period, they slow you down instead. But I'm also a person that favors "slow software design and development" where you take your time to nail down a good design/architecture before you run. Slow is fast, and avoiding hairballs is the most important part if you're aiming for "move fast for longer" rather than "a sprint of fast".
deaux 4 days ago |
I've used Langfuse. It's completely unrelated to tools like Langchain and Autogen. It's just logging/tracing for LLMs. Sure they added stuff like "prompt management" and "experiments" etc. probably to keep investors happy but those are entirely optional side dishes.
The tools you mentioned are indeed to be avoided. I trialed them early on and quickly realized in 99.9% they do nothing but bog you down. Pretty sure they'll be dead sooner rather than later.
agsqwe 3 days ago |
We like Langfuse for observability via OpenTelemetry. Prompt management is too basic for our needs.
marcklingen 3 days ago |
Let me know what features are missing in prompt management
amai 4 days ago |
What is the advantage of a specialized llm tracing solution like langfuse vs a complete tracing solution like logfire: https://pydantic.dev/logfire ?
axpy906 4 days ago |
SaaS company pivots to AI. Gets funding rebranded as AI company. Buys a company that actually knows it.
It’s still early but I question how many of these SaaS companies will continue. I’d rather connect Claude or whatever to do my task than have to learn a new platform let alone login to it.
antoniojtorres 4 days ago |
I don’t think that is an accurate depiction of ClickHouse. I don’t think they’re pivoting from their main data warehousing product at all. Probably making their cloud offering more competitive with other providers.
axpy906 4 days ago |
I haven’t used their product so you’re probably right. I’m biased as an AI engineer because I get contacted to help implement AI in existing platforms. While I admire the pivot the reality is what they have is already quite behind. Anything I make these days is old in about three months… You’d ideally want to start fresh and not have to worry about codebase that is years old.
amai 4 days ago |
Since clickhouse is headquartered in the US that means the langfuse cloud is no longer GDPR compliant.
deaux 4 days ago |
Correct! Will be moving away immediately for this reason.
Or well, technically incorrect, as someone will surely point out. US companies can be legally compliant with GDPR, it's just that the likes of the CLOUD Act and FISA make it completely meaningless.
Before anyone comes in talking about how it's farfetched that those matter, it's 100x as far-fetched that self-hosted Chinese LLM models would exfiltrate your data (you can even airgap them) yet 90% of corporate America is avoiding them based solely on the country they were trained in. Compared to that insanity, above US acts are a very real threat.
And that's of course on top of that now an adversarial state's company has the power to immediately dissolve Langfuse.
LunaSea 3 days ago |
Isn't ClickHouse owned by Nebius in Amsterdam?
Rafert 4 days ago |
I'm surprised it's not mentioned yet, but this seems to complement last year's acquisition of observability tool HyperDX[1] (part of ClickStack[2]) quite well. I'm in the market for a new o11y platform and it seems all vendors are working to add LLM observability one way or the other, if they haven't added it already.
1: https://news.ycombinator.com/item?id=44194082
2: https://clickhouse.com/use-cases/observability
gandreani 4 days ago |
What are you using now and what are you looking for in your new platform?
smithclay 4 days ago |
This is part of a bigger consolidation trend, AI hype or not: which general-purpose data vendor gets to store and query all of your observability and business data?
Snowflake acquired Observe last week, AWS made it easy in December to put logs from Cloudwatch in their managed iceberg catalog, and Azure is doing a bunch of interesting stuff with Fabric.
The line between your data lake/analytics vendor and observability vendor is getting blurry.
deaux 4 days ago |
Very sad, for all their marketing around EU, GDPR, privacy and so on. I feel dumb for having fallen for it a little.
This is a big reason why there are so few EU tech startups, they get bought out if they're doing well, more and more consolidation in tech, more and more "exits".
jimmyl02 3 days ago |
Clickhouse's full announcement is here https://clickhouse.com/blog/clickhouse-raises-400-million-se... and I think another big piece is directly integrating postgres into their ecosystem.
It seems like an expansion play from their team, and their end vision as both a platform (clickhouse + postgres) and product (observability) seems to be a pretty good combo that fits hand in hand.