localfirst.fm

A podcast about local-first software development

#18 – James Arthur: ElectricSQL, read-path syncing, PGlite


The guest of this episode is James Arthur, founder and CEO of ElectricSQL, a Postgres-centric sync engine for local-first apps. This conversation dives deep into how Electric works and explores design decisions such as read-path syncing and using HTTP as the network layer to improve scalability. Towards the end, we also cover PGlite, a new Postgres-in-Wasm project by Electric.

Thank you to PowerSync and Rocicorp for supporting the podcast.

Transcript

Intro
00:00I mean, another thing is like the operational characteristics of the
00:02system, for this type of sync technology.
00:05So comparing HTTP with WebSockets, like WebSockets are stateful, and
00:09you do just keep things in memory.
00:11If you look across most real-time systems, they have scalability limits because
00:16you will come to the point where if you have, say, 10,000 concurrent users,
00:19it's almost like the thing of don't have too many open Postgres connections.
00:22But if you're holding open 10,000 WebSockets, you may be able to do the
00:26IO efficiently, but you will ultimately be sort of growing that kind of memory
00:30and you'll hit some sort of barrier.
00:31Whereas, with this approach, you can basically offload that
00:34concurrency to the CDN layer.
00:37Welcome to the localfirst.fm podcast.
00:39I'm your host, Johannes Schickling, and I'm a web developer, a
00:42startup founder, and love the craft of software engineering.
00:46For the past few years, I've been on a journey to build a modern, high quality
00:50music app using web technologies.
00:52And in doing so, I've been falling down the rabbit hole of local-first software.
00:56This podcast is your invitation to join me on that journey.
01:00In this episode, I'm speaking to James Arthur,
01:03founder and CEO of ElectricSQL, a Postgres-centric sync
01:07engine for local-first apps.
01:09In this conversation, we dive deep into how Electric works and explore
01:14its design decisions, such as read path syncing and using HTTP as a
01:18network layer to improve scalability.
01:21Towards the end, we're also covering PGlite, a new project by Electric
01:26that brings Postgres to Wasm.
01:28Before getting started, a big thank you to Rocicorp and PowerSync
01:32for supporting this podcast.
01:34And now, my interview with James.
01:37Welcome James.
01:37So good to have you on the podcast.
01:39How are you doing?
01:40Great.
01:41Yeah, really good to be here.
01:42Thank you for having me on.
01:43So the two of us have known each other for quite a while already.
01:47And to be transparent, the two of us have actually already had quite
01:51a couple of projects together.
01:53The one big one among them is the first Local-First Conference that we
01:57organized together this year in Berlin.
01:59That was a lot of fun.
02:00But for those in the audience who don't know who you are, would
02:05you mind introducing yourself?
02:07So, my name is James Arthur.
02:09I am the CEO and one of the co-founders of ElectricSQL.
02:14So, Electric is a Postgres sync engine.
02:18We sync little subsets of data out of Postgres into wherever you
02:24want, like local apps and services.
02:26And we do also have another project which we developed called PGlite,
02:30which is a lightweight WASM Postgres.
02:33So we can sync out of Postgres in the cloud, into Postgres in the web browser,
02:39or kind of into whatever you want.
02:40Awesome.
02:41So yeah, I want to learn a lot more about Electric as well as PGlite.
02:45Maybe PGlite a little bit towards the end of this conversation.
02:49So Electric, I've seen it a bunch of times.
02:53I've been playing around with it, I think quite a bit last year, but
02:59things seem to also change quite a bit.
03:01Can you walk me through
03:03the history of like the last couple of years as you've been
03:08working on Electric, and help me form the right mental model about it?
Electric SQL
03:15Yeah, absolutely.
03:16I think like Electric as a project, it started, in a way, building on a bunch
03:24of research advances in distributed systems, CRDTs, transactional causal
03:29consistency, a bunch of these primitives that a lot of people are building
03:33off in the local-first space, which actually a bunch of people on our team
03:38developed in the kind of research stage.
03:41And we wanted to create a developer tooling and a platform that allowed people
03:48who weren't experts in distributed systems and didn't have PhDs in CRDTs to be able
03:53to harness the same advances and build systems on the same types of guarantees.
03:58So in a way, that's where we started from.
04:00And we started building out on this research base into stronger consistency
04:05models for distributed databases and doing sync, from like a central
04:11cloud database out into whether it's to the edge or to the client.
04:16And then we're a startup.
04:17So like we built a small team and you go through this journey, building a
04:21company of, you have ideas for what's going to be useful and valuable for
04:26people, and you have a sense of sort of where the state of the art is and, what
04:29doesn't exist yet, but as you then go and experiment, you just learn more and more.
04:33And so you work out actually what people need and what
04:36problems you can solve with it.
04:38and so through that journey, we went from starting off thinking we were building
04:42a next generation distributed database to using the replication technology
04:48for that system behind existing open source databases like Postgres, SQLite,
04:53into finding, local-first software as a pattern is really the killer app for
04:58that type of replication technology.
05:00So people looking to build local-first applications because of all of the
05:04benefits around UX, DX, resilience, et cetera, but to do that, you
05:09need this type of sync layer.
05:11and then when we first focused on that, then we tried to build a
05:15very optimal end to end integrated local-first software platform.
05:19So for instance, if people saw Electric as a project, like this time last
05:23year, that's what we were building.
05:25And in a way we just found that we were having to solve too many problems and
05:30there was too much complexity making a kind of optimal one-size-fits-all sort of
05:34magic active-active replication system.
05:37We were doing things like, managing the way you did the database migrations
05:40and schema evolution and generating a type-safe client and doing the
05:44client side reactivity as well as all this sort of core sync stuff.
05:47So, as you know, there's a lot to that kind of end to end stack.
05:51Because we had wanted to build a system that integrated with people's
05:55existing software, like if you already had software built on Postgres or if
05:59you already had a working stack, like building that sort of full system was
06:05in a way sort of too complex and was difficult to adopt from existing software.
06:10So more recently we have consolidated down on building a much simpler sync engine,
06:17which is more like a composable tool that
06:20you can run in front of Postgres, any Postgres.
06:23It works with any standard Postgres, any managed Postgres, any data model, any
06:28data types, any extensions that you have.
06:30And it just does this work of basically consuming the logical
06:35replication stream from Postgres,
06:37and then managing the way that the data is fanned out to clients,
06:41doing partial replication.
06:42So, because when you're syncing out, say, if you have
06:44a larger database in the cloud,
06:47and you're syncing out to like an app or a kind of edge service,
06:49you don't want to sync all the data.
06:51We have this sort of model of partial replication.
06:54And basically what we're aiming to do with the sync engine is just make that as
06:58simple to use and as bulletproof as possible.
07:01And we're making it with standard web technologies that make it easy
07:06to use with your existing systems and with your existing stack.
07:09And so we went in a way from this sort of quite ambitious, tightly integrated
07:13end to end local-first software platform to now building more like composable
07:18tools that can be part of a local-first stack that you would assemble yourself
07:22as a developer, that's designed to be
07:25easier to adopt for production applications that work
07:28with your existing code.
07:29That makes a lot of sense.
07:31And that definitely resonates with me personally as well, since maybe,
07:34as you know, before I founded Prisma, Prisma actually came as a pivot out of
07:40like a focusing effort from a previous product that was called Graphcool,
07:44which was meant as a more ambitious next generation backend as a service.
07:48Back then there was like Firebase and Parse and so we wanted to build the
07:52next generation of that, but what we found back then in 2016, that, while
07:57we've been making a lot of progress towards that very ambitious, holistic
08:02vision, we had to basically boil, like, multiple oceans all at the same time.
08:06And that takes a lot of time to fully get to all the different
08:10ambitious things that we wanted to.
08:12So the only way forward for us where we felt like, okay, we can actually
08:16serve the kind of use cases that we want to serve in a realistic timeline
08:21was to focus on a particular problem, which is what Prisma eventually became.
08:26And by focusing just on the database tooling part and leaving the other
08:31back-endy things to other people.
08:32And it sounds like what you've been going through with Electric is a very comparable
08:36exercise, like a focusing exercise: trying to go from a starting point of
08:41like, let's build the most ambitious, the best local-first stack, like end to
08:46end, to focusing more on like, okay, we figured out where our expertise is,
08:52is around Postgres, is about existing applications wanting to adopt local-first
08:58ideas, syncing approaches, et cetera.
09:01And that is what now led to the new version of Electric.
09:04Did I summarize that correctly?
09:06Yeah, exactly.
09:07Right.
09:07It sounds like a very similar journey.
09:09And I think it's interesting as well that as you focus in and you learn
09:13more about a problem space, you both discover in a way, more of the
09:17complexity in the sort of aspects of it.
09:19So you realize there's actually more challenges to solve in a smaller sort
09:23of part of it or a smaller scope.
09:26And also it's interesting that I think for instance, when we started the
09:29project, I would have thought coming into this as a software developer,
09:32I'd go, Is a read path sync solved?
09:34I'd be like, well, there's quite a lot of read path kind of sync stuff.
09:37You can kind of do this.
09:38There's various real time solutions, but actually as you dig into it, you find
09:42that there's a whole bunch of weaknesses of those solutions and they're actually
09:45hard to adopt or they have silos or they can't handle the data throughput.
09:48And so you realize that actually you don't necessarily need to bite
09:53off all of the more ambitious scope because actually you can deliver
09:57value by doing something simpler.
10:00And I think also for me personally, learning about stewarding this
10:03type of product, understanding that you can build out still towards
10:08that more ambitious objective.
10:09So in the long run, you know, we want to sort of build back a whole bunch
10:12of capabilities into this platform,
10:14probably as a sort of loosely coupled kind of composable tools.
10:18So you mentioned the term read path syncing.
10:21Can you elaborate a little bit what that means?
10:24So let's say I have an existing application.
10:26Let's say I've built an API layer at some point.
10:29I have a React front end and I have all of my data sitting in Postgres.
10:34I've been inspired by products such as Linear, et cetera, who seem to
10:38wield a superpower called syncing.
10:40And now I found ElectricSQL, which seems to connect the ingredients
10:45that I already have, such as Postgres and a front end with my
10:50desirable approach, which is syncing.
10:52So how does Electric fit into that?
10:55And what do you mean by read path syncing?
Read and Write Path Syncing
10:59Yeah.
10:59I mean, the sort of read path and write path when it comes to
11:02sync, the read path is syncing data, like onto the local device.
11:06So it's a bit like kind of data fetching from the server.
11:09And then the write path would be when like a user makes a write, and then
11:12you want to sync that data typically back to the cloud so that's sort
11:16of how we talk about them there.
11:19I think there's something unique about local-first software compared to
11:25more sort of traditional web service systems where you explicitly have
11:31a local copy of the data on device.
11:34And one of the challenges with that is because of course you can just like load
11:39some data from the server and keep it in a cache, but if you do that, then you
11:44immediately actually lose any information about whether that data is stale.
11:49So say a user goes to a route on your application and then clicks
11:54to go to another route and then comes back to the original one.
11:57So to load that original route, say you did a data fetch, but
12:01now you've navigated back to it.
12:02Can you display that data?
12:04Can you render the route or is the data stale?
12:08And so you have this sort of thing where I don't really know, and you tend to sort
12:12of build systems with like REST APIs and data fetching where you might show the
12:15data and go and try and fetch new data.
12:17but in a way it's that problem of you want the data locally so that your application
12:23code can just talk to it locally and you're not having to code across the
12:26network with local-first software.
12:28But that means that you need a solution to keep the data that is local fresh.
12:33Like you don't want stale data.
12:35And if you build a sort of ad-hoc system.
12:38As we've all done across like many generations of software applications,
12:41it's one of these things where you always end up kind of building some sort
12:44of system to keep the data up to date.
12:46But what you really want is a kind of properly engineered system
12:49that does it systemically for you.
12:51It is really a sort of an aspect of your applications architecture that kind of
12:56can be abstracted away by a sync engine.
12:58And so for us, this focusing on the read path sync is about saying,
13:02okay, what data should be on the device, and let's just keep it
13:06fresh for you.
13:07And then with the write path, one of the things that we learned through
13:11the project is that there are a lot of valid patterns for handling how, when
13:17you do local writes on the device, how you would get those back to the cloud.
13:22You can do it through the database sync, you can do optimistic writes.
13:26You could be happy with online writes and you have different models of
13:30like, can your writes be rejected?
13:32Are they local writes with finality?
13:34Or do you have a server authoritative system where when the write
13:37somehow syncs, it can be rejected and how do you handle that?
13:40And so there's actually a lot of different patterns for those writes,
13:43which are often relatively simple because different applications can
13:48be happy with certain trade offs and you could pick a model like:
13:51okay,
13:51I'm going to show some optimistic state and make a request to an API server.
13:56And it's fine.
13:57And you get a kind of, you get a local-first, experience with just a
14:00sort of simple model that says, okay, if the write is rejected when it
14:03syncs, then, I'll just sort of roll it back and the user loses that work.
14:07And for many applications, that's fine.
14:09For other applications, you might have a much more complex conflict resolution or
14:13you're trying not to lose local writes and there's different collaborative workloads.
14:16And so.
14:17Building a generic system that can give you a write path that gives you
14:21the best developer experience and user experience for all of those variety of
14:25scenarios is very, very hard, whereas building it on an application by
14:28application basis on the write path is actually often fairly straightforward.
14:32It can be like, post to your API and use the React useOptimistic hook.
14:37And so, with building local-first applications that have both read and
14:40write path with Electric, the idea is that we do this core read path
14:45with partial replication, but then as you're building your application, you
14:49can choose, out of a variety, whichever pattern fits what you need the
14:53most for sort of how you would choose to get the writes back into the server.
14:57That makes a lot of sense.
14:58So basically the more general-purpose
15:01building block that can be used across a wide range of different applications
15:05is actually how you read data, how you distribute the data that you
15:09want to have locally available in your applications that would kind of
15:13replace the API get requests before.
15:17But now what needs to happen in those Put, post, delete requests,
15:21this is where it depends a lot more.
15:24And this is where you basically, what you're arguing is there are different
15:28sort of write patterns that heavily depends on the kind of application.
15:32So that is where you're kind of leaning out.
15:34And previously with Electric, you tried to provide the silver bullet there.
15:39But actually, it's really hard, maybe impossible to find the silver
15:43bullet that applies to all use cases.
15:45However, for the read path, it is very possible to provide a great building
15:50block that works for many use cases.
15:52So, can you provide a bit of a better spectrum of the different write
15:56patterns that you've seen so far?
15:58Maybe map them to canonical applications?
16:02that illustrate those use cases.
16:04And maybe if you know, maybe you can also compare analogies to something
16:08like Automerge, et cetera, which sort of write patterns that would
Read Path use cases
16:14Yeah.
16:15So I think the simplest pattern for writes with an application would be to
16:19just, for instance, send a write to a server and require you to be online.
16:24So, because there's many applications that are happy, for instance, with read
16:27only, like there's a lot of people who are building, data analytics applications,
16:31data visualization, dashboards, et cetera.
16:33And so if you have a sort of read heavy application, then in some cases
16:37it may just be a perfectly valid trade off, not to really deal with the
16:40complexity of say offline writes at all.
16:42But you still have a lot of benefits by having local data on device for the read
16:46path, because all the way you can kind of explore the application and the data is
16:50all just instant and local and resilient, then the sort of simplest pattern to
16:56layer on support for offline writes
16:59on top of that, as a sort of starting point, where imagine that you have like a
17:03standard REST API and you're just doing put and post requests to it as normal is
17:08to add this concept of optimistic state.
17:10So optimistic state is just basically you're saying, okay, I'm going to go and
17:14try and send this write to the API server.
17:16And whilst I do so, I'm going to be optimistic and imagine that
17:20that write is going to succeed.
17:22And in two seconds later, it's going to sync back into the state that I have here.
17:25But in the meantime, I'm going to add this bit of local optimistic state to
17:30display it immediately to the user, and because in most cases that sort of happy path
17:34is what happens, then you end up with what just feels like a perfect local-first
17:39experience because it's an instantly displayed local write, and that sort
17:43of data is resolved in the background.
17:45Now, you know, immediately with that, you do then just introduce like a layer
17:49of complexity with like, well, what happens when the write is rejected?
17:54And so you have both the challenge of, for instance, say you stacked up three writes.
18:01Did they depend on each other?
18:03So if one of them is rejected, should you reject all of them?
18:06and different applications and different parts of the application would have
18:09different answers to that question.
18:11In some cases, like it's very simple to just go, if there's any problem with
18:14this optimistic state, just wipe it.
18:16And for instance, like the React useOptimistic hook, like its approach is just
18:20like, it waits for a promise to resolve.
18:22And when the promise resolves, it wipes the optimistic state.
18:25And so it's very much just like, if anything happens at all,
18:28the optimistic state is gone.
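As a rough illustration of that simplest pattern (not from the episode): a minimal sketch using React's useOptimistic hook, where the /api/todos endpoint and the todo shape are hypothetical. The write is shown immediately and simply disappears if the request fails.

```tsx
import { useOptimistic } from 'react'

type Todo = { id: string; title: string }

function Todos({ todos }: { todos: Todo[] }) {
  // Synced todos plus any writes that are still in flight.
  const [optimisticTodos, addOptimisticTodo] = useOptimistic(
    todos,
    (current: Todo[], newTodo: Todo) => [...current, newTodo]
  )

  // Form action: show the write immediately, then send it to the existing
  // API. If the request fails, the optimistic entry is wiped when the
  // action settles and the component re-renders from `todos`.
  async function createTodo(formData: FormData) {
    const todo = { id: crypto.randomUUID(), title: String(formData.get('title')) }
    addOptimisticTodo(todo)
    await fetch('/api/todos', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(todo),
    })
  }

  return (
    <form action={createTodo}>
      <input name="title" />
      <ul>
        {optimisticTodos.map((t) => (
          <li key={t.id}>{t.title}</li>
        ))}
      </ul>
    </form>
  )
}
```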
18:30Interestingly enough, there's also a lot of people coming from React Query and so
18:35on, from those sort of more traditional front end state management things.
18:40And that brings them to local-first in the first place, because they're like
18:44layering optimistic, one optimistic state handler on top of the next one.
18:49And if there's a little flaw inside of there, everything collapses
18:53since you don't really have a principled way to reason about things.
18:57So that makes a lot of sense.
18:59Exactly right.
19:00And so like a framework like TanStack, for instance, with TanStack Query, it has like
19:05slightly more sophisticated optimistic state primitives than just, say, the kind
19:10of a primitive useOptimistic hook.
19:12And one of the thing, one of the challenges that you have is that for
19:15say, a simple approach to, to just using optimistic state to display an immediate
19:20write is like, is that optimistic state global to your application?
19:24Shared between components?
19:25Is it scoped within the component?
19:27And so, as you say, like there's an approach where you could come along
19:30and say, okay, I've got three or four different components and so far I've
19:33just been able to sort of render the optimistic state within the component.
19:37But now I've got two components that are actually displaying the same information.
19:40And suddenly I've got like stale data.
19:42It's like the old days of manual DOM manipulation and you forgot
19:45to update a state variable.
19:47And so.
19:48Yeah, in a way that's where you come to a more proper local-first solution
19:53where your optimistic state would be, stored in some sort of shared store.
19:58So it could just be like a JavaScript object store, or it
20:01could be an embedded database.
20:03And so you get a slightly more sophisticated models of
20:07managing optimistic state.
20:08And the great thing is there are, like TanStack Query and others, there's
20:11like, there's a bunch of existing client side frameworks that can handle
20:14that kind of management for you.
20:17Once you go, for instance, like to an embedded database for the state.
20:21So one of the kind of really nice, points in the design space for this is to have a
20:27model where you sync data onto the device and you treat that data as immutable.
20:32And then you can have, for instance, so, so say, for instance, you're syncing a
20:37database table, say it's like a log viewer application, and you're just syncing the
20:41logs in, and it goes into a logs table.
20:44Now, say the user can interact with the logs and delete them,
20:47or change the categorization.
20:49And so you can have a shadow logs table, which is where you would
20:52save the local optimistic state.
20:54And then.
20:55You can do a bunch of different techniques to, for example, create a view or a live
20:59query where you combine those two on read.
21:02So the application just sort of feels like it's interacting with the table,
21:05but actually it's split in the storage layer into an immutable table for the sync
21:09state and a kind of local mutable table.
21:12And the great thing about that is you can have persistence for the, both the
21:15sync state and the, local mutable state.
21:18And of course it can be shared.
21:19So you can have multiple components, which are all sorts of just going
21:22through that unified data store.
21:24And there's some nice stuff that you can do in SQL world, for instance, to use
21:27like instead-of triggers to combine it.
21:29So it just feels like you're working with a single table.
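To make the shape of that split concrete, here is a minimal sketch (not from the episode) using PGlite, Electric's embedded Postgres discussed later in this conversation, as the local store. The logs schema is hypothetical; an immutable synced table and a local shadow table are combined on read by a view.

```ts
import { PGlite } from '@electric-sql/pglite'

// Embedded Postgres in the browser; 'idb://...' persists to IndexedDB.
const db = new PGlite('idb://log-viewer')

await db.exec(`
  -- Written only by the sync stream; treated as immutable by the app.
  CREATE TABLE IF NOT EXISTS logs_synced (
    id TEXT PRIMARY KEY,
    category TEXT,
    message TEXT
  );
  -- Local optimistic edits (the "shadow" table).
  CREATE TABLE IF NOT EXISTS logs_local (
    id TEXT PRIMARY KEY,
    category TEXT,
    message TEXT,
    deleted BOOLEAN NOT NULL DEFAULT FALSE
  );
  -- Combine the two on read; local values win over synced ones.
  CREATE OR REPLACE VIEW logs AS
    SELECT
      COALESCE(l.id, s.id) AS id,
      COALESCE(l.category, s.category) AS category,
      COALESCE(l.message, s.message) AS message
    FROM logs_synced s
    FULL OUTER JOIN logs_local l ON l.id = s.id
    WHERE COALESCE(l.deleted, FALSE) = FALSE;
`)

// The app reads from the view; an INSTEAD OF trigger on the view could
// route writes into logs_local so it feels like a single table.
const { rows } = await db.query(`SELECT * FROM logs ORDER BY id`)
```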
21:32Now it's a little bit additional complexity on something like defining
21:35a client side data model, but what it gives you is it gives you a
21:39very solid model to reason about.
21:42So like, You can go, okay, basically the sync state is always golden.
21:46It's immutable.
21:46Whenever it syncs in, it's correct.
21:48If I have a problem with this local state, that's just, that's like mutable stuff.
21:53Worst case, I can get rid of it, or I can develop more sophisticated strategies for
21:57dealing with rollbacks and edge cases.
22:00So it in a way it can give you a nice developer experience.
22:04with that model, you could choose then whether your writes are, whether you're
22:08writing to the database, detecting changes, and then sending those to
22:11some sort of like replication ingest point, or whether you're still just
22:15basically talking to an API and writing the local optimistic state separately.
22:21So, so at that point you can have, again, you can have, you have this
22:24fundamental model of like, Are you writing directly to the database and
22:27all the syncing happens magically?
22:29Or are you just using that database as a sort of unified, local optimistic store?
22:34So this is the sort of type of like progression of patterns.
22:36And once you start to go through something where you would, for instance, have a
22:42synced state that is mutable, or you are writing directly to the database,
22:46that's really where you start to get a little bit more into the world of like
22:49convergence logic and kind of merge logic and CRDTs and sort of what's commonly
22:54understood as proper local-first systems.
22:57And I think that's the point where almost the complexity of those
22:59systems does become very real.
23:01Like, as you well know, from building LiveStore and as we see from the
23:04kind of, quality of libraries like Automerge, Yjs, et cetera.
23:08So that's probably where as a developer, it makes sense to reach for a framework.
23:12And you certainly could reach for a framework for that sort of like,
23:15combine on read, sync into a kind of persisted local mutable state.
23:21But what we find is that it is actually if you want to, it's actually
23:25relatively straightforward to develop yourself, you can reason about it
23:28fairly simply, and so it's not too much extra work to just basically go
23:32as long as you've got that read sync primitive, you can build like a kind of
23:36proper locally persistent, consistent local-first app yourself, basically,
23:42just using fairly standard front end primitives.
23:44Right.
23:45Okay.
23:46Maybe sharing a few reflections on this, since I like the way how you,
23:50portrayed this sort of spectrum of this different kind of write patterns.
23:54in a interview that I did with Matthew Weidner, I learned a lot there
23:58about the way, how he thinks about different categorizations of like state
24:02management, and particularly when it comes to distributed synchronization.
24:07and I think one pattern that got clear there was that there's either you're
24:12working directly manipulating the state, which is what like Automerge, et
24:16cetera, are de facto doing for how you as a developer interact with the state.
24:21So you have like a document and you manipulate it directly.
24:25You could also apply the same logic of like, you have a database table, for
24:30example, that's how cr-sqlite works, where you have a SQLite table and you
24:35manipulate a row directly and that is being synchronized as the state and
24:41you're ideally modeling this with a way where the state itself converges and
24:46through some mechanisms, typically CRDTs.
24:49But then there's another approach, which might feel a little bit more
24:53work, but it can actually be concealed quite nicely by systems, for example,
24:58like LiveStore, in this case, unbiased, and where you basically separate
25:02out the reads from the writes.
25:05And often enough, you can actually fully recompute your
25:10read model from the write model.
25:12So, if you then basically express everything that has happened, that
25:16has meaningfully happened for your application as a log of events.
25:20Then you can often kind of like how Redux used to work or still works, you can
25:24fully recompute your view, your read model from all the writes that have happened.
25:29And I think that would work actually really, really well together in tandem
25:33with Electric, where if you're replicating what has happened in your Postgres
25:39database as like a log of historic events, then you can actually fully recreate
25:45whatever derived state you're interested in. And what is really interesting about
25:49that approach, about that particular write pattern, is that it's a lot easier to
25:54model that and reason about that locally.
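As a sketch of that Redux-style idea (illustrative only, not LiveStore's actual API): the read model is just a fold over an event log, with locally pending events applied after the confirmed ones.

```ts
// Events, whether confirmed by the server or still pending locally.
type TodoEvent =
  | { type: 'todoCreated'; id: string; title: string }
  | { type: 'todoCompleted'; id: string }
  | { type: 'todoDeleted'; id: string }

type Todo = { id: string; title: string; completed: boolean }

// Reducer: apply one event to the read model.
function applyEvent(state: Map<string, Todo>, event: TodoEvent): Map<string, Todo> {
  const next = new Map(state)
  switch (event.type) {
    case 'todoCreated':
      next.set(event.id, { id: event.id, title: event.title, completed: false })
      break
    case 'todoCompleted': {
      const todo = next.get(event.id)
      if (todo) next.set(event.id, { ...todo, completed: true })
      break
    }
    case 'todoDeleted':
      next.delete(event.id)
      break
  }
  return next
}

// Rebuild the read model from the full log: confirmed events first,
// then any optimistic events still waiting to be acknowledged.
function materializeTodos(confirmed: TodoEvent[], pending: TodoEvent[]): Map<string, Todo> {
  return [...confirmed, ...pending].reduce(applyEvent, new Map<string, Todo>())
}
```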
25:57Did you say like, Hey, I got those events from the server, those
26:00events, I am applying optimistically.
26:03You can encode sort of even a causal order that doesn't really, If someone
26:09is like confused about what does causal order mean, don't worry about it.
26:13Like you can probably at the beginning, keep it simple, but once you layer
26:18on like more and more dependent, optimistic state transitions, this is
26:22where you want to have the information.
26:25Okay.
26:25If I'm doing that, and then the other thing depends on that, that's basically a
26:29causal order. And modeling that as events,
26:32I think, is a lot simpler and is a way to, to deal with that monstrosity of like,
26:38losing control over your optimistic state.
26:41Since I think one thing that's, that makes optimistic state management
26:44even more tricky is that, like, how are things dependent on each other?
26:50And then also like, when is it assumed to be good.
26:54I think in a world where you use Electric, once you have, from the
26:57Electric server, got sort of confirmation, like, Hey, those
27:01things have now happened for real.
27:02You can trust it.
27:04but there's like some latency in between, and the latency might be
27:07increased by many, many factors.
27:10One way could be that you just, you are on a like slow connection or the server
27:15is particularly far away from you and might take a hundred milliseconds, but
27:19another one might be you have a spotty connection and like packets get lost and
27:25it takes a lot longer, or you're offline, and being offline is just like
27:30a very high latency form, and so all of that, like if you're offline,
27:36if it takes a long long time, and maybe you close your laptop, you reopen it.
27:41Is the optimistic state still there?
27:43Is it actually locally persisted?
27:45So there are many, many more layers that make that more tricky.
27:49But I like the way how you're like, how you split this up into the read
27:54concerns and the write concerns.
27:56And I think this way, it's also very easy to get started with new
28:00apps that might be more read heavy and are based on existing data.
28:05I think this is a very attractive trade off that you say like, Hey, with
28:09that, I can just sync in my existing data and then step by step, depending
28:14on what I need, if I need it at all.
28:16Many apps don't even need to do writes at all, and then you
28:19can just get started easily.
28:21Yeah, I think, I mean, that's explicitly a design goal for us is like, yeah,
28:25if you start off with an existing application and maybe it's using REST
28:29APIs or GraphQL, it's like, well, what do you do to start to move that
28:32towards a local-first architecture?
28:34And exactly, you could just go, okay, well, just, let's just leave the way
28:37that we do writes the same as it is.
28:39And let's move to this model of like syncing in the data
28:41instead of fetching the data.
28:43And that can just be a first step.
28:45And I think, I mean, Across all of these techniques for writes, there
28:48is just something fundamental about keeping the history or the log
28:52around as long as you need it, and then somehow materializing values.
28:58So sort of internally, this is what a CRDT does, right?
29:01it's clever and has a sort of lattice structure for the history, but basically
29:05it keeps the information and allows you to materialize out a value.
29:09if you just have like an event log of writes.
29:11So as you were saying with, with LiveStore, when you have like a
29:14record of all the write operations, you can just process that log.
29:17so I think, you know, you can do it sort of within a data type.
29:21And I think that fits as well for greenfield application where you're trying
29:25to craft, kind of real time or kind of collaboration and concurrency semantics,
29:29but like from our side of coming at it, from the point of saying, right, when
29:32you've got applications that are built on Postgres, you already have a data model.
29:35You just sort of layer the same kind of history approach on top by like, keeping
29:39a record of the local writes until you're sure you can compact them, and actually
29:44that same principle is exactly how the read path sync works with Electric.
29:49So Postgres logical replication, it just basically, it emits a stream, it's like
29:56transactions that contain write operations and it's basically inserts, updates,
30:00and deletes with a bit of metadata.
30:02And so we end up consuming that and basically writing
30:06out what we call shape logs.
30:07So we have a primitive called a shape, which is how we control the partial
30:10replication, like which data goes to which client and a client can define multiple
30:14shapes, and then you stream them out.
30:16But that shape log comes through our replication protocol as just that
30:21stream of logical update operations.
30:23And so in the client, you can just, you can materialize the data immediately.
30:28So like we provide, for instance, a shape stream primitive in a JavaScript client
30:32that just emits the series of events.
30:34And then we have a shape, which will just take care of materializing that
30:37into a kind of map value for you.
30:39But you could do whatever you wanted with that stream of events.
30:42So if you found that you wanted to keep around a certain history of the
30:46log in order to be able to reconcile some sort of causal dependencies,
30:49that's just totally up to you.
30:51And so, yeah, it's quite interesting that it's almost just the same approach,
30:54which is the general sort of principle for handling concurrency on the
30:58write path is also just exactly what we've ended up consolidating down on
31:02exposing through the read path stream.
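To make that concrete, here is a sketch of the kinds of messages a shape log delivers and how a client might fold them into a map of rows. Field names are illustrative; the exact wire format is in the Electric docs.

```ts
// Change messages carry an insert/update/delete plus the row data;
// control messages signal things like "you are now up to date".
type ChangeMessage = {
  headers: { operation: 'insert' | 'update' | 'delete' }
  key: string                     // identifies the row within the shape
  value: Record<string, unknown>  // column values
}

type ControlMessage = {
  headers: { control: 'up-to-date' | 'must-refetch' }
}

type ShapeLogMessage = ChangeMessage | ControlMessage

// Materialize the log into current rows, or keep the raw log around if
// you need the history (e.g. to reconcile causal dependencies).
function materialize(messages: ShapeLogMessage[]): Map<string, Record<string, unknown>> {
  const rows = new Map<string, Record<string, unknown>>()
  for (const msg of messages) {
    if (!('key' in msg)) continue // control message
    if (msg.headers.operation === 'delete') rows.delete(msg.key)
    else rows.set(msg.key, { ...rows.get(msg.key), ...msg.value })
  }
  return rows
}
```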
31:04That makes a lot of sense.
31:05So, let's maybe go a little bit more high level.
31:08Again, for the past couple of minutes, we've been talking a lot about like how
31:12Electric happens to work under the hood.
31:14And there's many commonalities with other technologies and
31:17all the way to CRDTs as well.
31:19But going back a little bit towards the perspective of someone who would
31:23be using Electric and build something with Electric and doesn't maybe
31:28peel off all the layers yet, but get started with one of the easier off the
31:32shelf options that Electric provides.
31:35So my understanding is that you have your existing Postgres database.
31:40you already have your like tables, your schema, et cetera, or if it's
31:44a greenfield app, you can design that however you still want.
31:47And then you have your Postgres database.
31:50Electric is that infrastructure component that you put in front
31:53of your Postgres database that has access to your Postgres database.
31:58In fact, it has access to the replication stream of Postgres.
32:02So it knows everything that's going on in that database.
32:05And then your client is talking to the Electric sync engine to
32:10sync in whatever data you need.
32:12And the way that's expressed what your client actually needs is through
32:17this concept that you call shapes.
32:19And my understanding is that a shape basically defines a subset
32:23of data, a subset of a table that you want in your client.
32:28since often like tables are so huge and you just need a particular
32:32subset for your given user, for your given document, whatever.
The role of Shapes
32:38Yeah, that's just exactly how it works.
32:40And the Electric sync engine, it's a web service.
32:44It's a Docker container, like technically it's an Elixir application.
32:47And it just connects to your Postgres as a normal Postgres client would.
32:52So you have to run your Postgres with logical replication enabled.
32:57And then we just connect in over a database URL.
32:59And so it's just as if you were like, imagine you're deploying a Heroku app,
33:03and it's sort of Heroku Postgres, and it just provisions a database URL, and your
33:06back end application can connect to it.
33:08So it's the same way that a sort of Rails app would talk to, talk to Postgres.
33:12And then Electric does some stuff internally to sort of route data into
33:16these shape logs, which are the sort of logs of update operations for each
33:21kind of unit of partial replication.
33:23And then we actually just provide a HTTP API, which is quite key to a whole
33:28bunch of the affordances of the system.
33:31So I can dive into that if it's interesting.
33:33But then, yeah, you basically have a client, which pulls data
33:37by just making HTTP requests.
33:39And so HTTP gives you back pressure and the client's in control of
33:44which data it pulls when, and then how you process that stream.
33:48Yeah, we do provide some primitives to make it simple.
33:51Like we give you React hooks to just sort of bind a shape to a state variable,
33:55but basically, you can do what you like with the data as it streams in.
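For a sense of what that looks like in practice, a minimal sketch using the Electric TypeScript client and React hook. Names follow the @electric-sql/client and @electric-sql/react packages, but treat the exact options (the /v1/shape URL, params, where clause) as assumptions to check against the current docs.

```tsx
import { ShapeStream } from '@electric-sql/client'
import { useShape } from '@electric-sql/react'

// Low level: subscribe to the raw stream of shape log messages.
const stream = new ShapeStream({
  url: 'http://localhost:3000/v1/shape',
  params: { table: 'items', where: "project_id = 'abc'" },
})
stream.subscribe((messages) => {
  // Inserts, updates, deletes and control messages arrive here.
  console.log(messages)
})

// In React, a hook binds a shape to component state and re-renders
// whenever new data syncs in.
function Items() {
  const { data } = useShape<{ id: string; title: string }>({
    url: 'http://localhost:3000/v1/shape',
    params: { table: 'items', where: "project_id = 'abc'" },
  })

  return (
    <ul>
      {data.map((row) => (
        <li key={row.id}>{row.title}</li>
      ))}
    </ul>
  )
}
```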
33:59So, yeah, I would love to learn more about that design decision of choosing HTTP
34:03for that network layer, for that API.
34:05Since I think most people think about local-first, think about real time
34:10syncing, et cetera, that reactivity.
34:13And for most people, I think particularly in the web, the mind goes to web sockets.
34:17So why HTTP?
34:19Wouldn't that be very inefficient?
34:21How does reactivity work?
34:23Can you walk me through that?
Why using HTTP for network layer?
34:25Yeah, so.
34:26I mean, exactly.
34:27We, went on that journey with the product where with the earlier, slightly more
34:30ambitious Electric that I was describing, we built out a custom binary WebSocket
34:36protocol to do the replication, and it's just what you sort of immediately
34:39think you're like, let's make it efficient over the wire and obviously
34:41it should be a WebSocket connection because you're just having these sorts
34:44of ongoing data streams, but, So one of the things that happened with the,
34:48focusing of the product strategy was that, Kyle Matthews joined the team.
34:52So Kyle was actually the founder of Gatsby, which is like the React framework.
34:57And through Gatsby, he did a lot of work around basically data
35:01delivery into CDN infrastructure.
35:04And so one of the insights that Kyle brought into the team was if
35:08we re-engineered the replication protocol on plain HTTP, and we just
35:13do like plain HTTP, plain JSON.
35:16And we replicate over an old fashioned long polling protocol.
35:20So you just, basically we have a model where the client makes a request to a
35:24shape endpoint, and then we just return the data that the server knows about.
35:28So we'll sort of chunk it up sometimes over multiple requests, but it's
35:31just a standard, like, load a JSON document request.
35:35And then once you get a message to say that the client is up to date
35:38with the server, then you trigger into a long polling mode where basically
35:41the server holds the connection open until any new data arrives.
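A simplified sketch of that request loop (parameter and header names here are illustrative, not the exact Electric protocol): catch up from the start of the shape log, then switch to long polling once up to date.

```ts
const SHAPE_URL = 'http://localhost:3000/v1/shape'

async function syncShape(table: string, onMessages: (msgs: any[]) => void) {
  let offset = '-1' // assumed convention: start from the beginning of the log
  let live = false  // switch to long polling once we're up to date

  while (true) {
    const params = new URLSearchParams({ table, offset })
    if (live) params.set('live', 'true')

    // In live mode the server holds the request open until new data
    // arrives or a timeout elapses; that's the long poll.
    const res = await fetch(`${SHAPE_URL}?${params}`)
    if (res.status === 204) continue // assumed: empty long poll, just poll again

    const messages: any[] = await res.json()
    onMessages(messages)

    // The response tells the client where to resume. Requests for older
    // offsets are immutable, which is what makes them CDN-cacheable.
    offset = res.headers.get('electric-offset') ?? offset
    if (messages.some((m) => m.headers?.control === 'up-to-date')) {
      live = true
    }
  }
}
```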
35:45And yes, you kind of think instinctively like, okay, it's say JSON instead of
35:50binary, so it'll be less efficient and you're having to make these
35:52sort of extra requests that surely they add latency over some sort of
35:56more optimized WebSocket protocol.
35:58But the key thing is that by doing that, it allows us to deliver the data
36:02through existing CDN infrastructure.
36:05So those initial data loading requests, like typically when you're building
36:10applications on this shape primitive, you can find ways of defining your shapes
36:15so that they're shared across users.
36:16You might have some unique data that's unique to a user, but Like say you have a
36:21project management app and there's various users who are all in the same project,
36:24you could choose to like sync the kind of project data down rather than just
36:28sort of syncing all the user's data down.
36:30And so that way you get shapes being shared across users.
36:33And so the first user to request it hits the Electric service, we
36:37generate these responses, but then they go through Cloudflare or Fastly
36:41or CloudFront or what have you.
36:43And every subsequent request is just served out of like
36:46essentially Nginx or Varnish.
36:48And so it's just super efficient.
36:50All of this infrastructure is just like super battle tested
36:52and as optimized as it can be.
36:54That is very interesting.
36:56It reminds me a little bit of like how modern bundlers, and I think even like
37:00all the way back to Webpack, used to split up larger things into little chunks.
37:06And those chunks would be content hashed.
37:08And that would be then often, be cached by the browser across
37:12different versions of the same app.
37:15In this case, it would be beneficial to the individual user who would reload it.
37:20And also of course, like to other people who visit this, but now you
37:24take the same idea, even further and apply it to data shared across users
37:29by applying the same infrastructure, HTTP servers, CDNs, et cetera, to make,
37:35things cheaper and faster, I guess.
37:38Well, and, and the local browser c or client cache as well.
37:41So you have this sort of shared caching within a CDN layer where you
37:45might have multiple clients, which are like, literally it's a sort of
37:48shared cache in the HTTP cache control.
37:50That makes a lot of sense.
37:50Since like, on a website level, I'm not sure whether you
37:53have clear caching semantics.
37:55I don't think so.
37:57Yeah, you'd have to do some very sort of custom stuff to
37:59sort of achieve the same things.
38:01But also because, so with the browser, when you're loading data, like HTTP
38:05requests with the right cache headers can just be stored in the local file cache.
38:09So one of the really nice things with just, like loading shape data
38:12through the Electric API is you can achieve an offline capable app without
38:16even having to implement any kind of local persistence for the data
38:20that's loaded into the file cache.
38:23So that sort of model, if like say you've gone to a page and you've just
38:26loaded the data through Electric, even if you didn't store the data, if you
38:30navigate back to the same page, the data's just there out of the file cache.
38:34So the application can work offline without even having
38:37any kind of persistence.
38:38So you almost get like, I mean, there's some sort of edge cases on this stuff,
38:41but it's the thing, because you're just working with the standard primitives,
38:44you've just got the integration with the existing tooling and you get a
38:47whole bunch of these things for free.
38:49That is very elegant and I guess that is being unlocked now because like
38:54you embrace the semantics of change of like how the data changes more and by
39:00modeling and this is where it now gets relevant again why everything here is
39:04modeled as a log under the hood since like to the log you just append and so
39:08you can safely cache everything that has happened up until a point in time,
39:12and from there on, you just add things on top, but that doesn't make the stuff
39:16that has happened before less valid.
39:18So you can cache it immutably.
39:20That makes it super fast.
39:21You can cache it everywhere on the edge, on your local device, et cetera.
39:25And that gives you a checkpoint that at least once in a point in time was
39:31valid, and now there might be more stuff that should be applied on top of
39:34it, but that's already a better user experience than not getting anything.
39:38I mean, another thing is like the operational characteristics of the
39:41system, for this type of sync technology.
39:44So, for instance, again, comparing HTTP with WebSockets, like
39:47WebSockets are stateful, and you do just keep things in memory.
39:51And so across, if you look across most real-time systems, they have scalability
39:55limits because you will come to the point where if you have, say, 10,000
39:57concurrent users, it's almost like, you know, it's like the thing of don't have
40:01too many open Postgres connections.
40:03But if you're holding open 10,000 WebSockets, you may be able to do the
40:07IO efficiently, but you will ultimately be growing that kind of memory and
40:11you'll hit some sort of barrier.
40:12Whereas, with this approach, you can basically offload that
40:15concurrency to the CDN layer.
40:17So, it's not just about, being, basically taking away the query workload of the
40:23cached initial sync requests, but these kind of reverse proxies or CDNs have
40:27a really nice feature called request collapsing or request coalescing, which
40:31means that when they have a cache of requests come in on a URL, if they have
40:36Two clients making a request to the same URL at the same time, they sort of hold
40:41both of them at the cache layer and only send one request onto the origin server.
40:45And so basically we've been able to scale out now to 10 million concurrent clients
40:51receiving real time data out of Electric on top of a single Postgres.
40:56And there is literally no CPU overhead on the Postgres or the Electric layer.
41:01It's just entirely handled out of the CDN serving.
41:05And so it's sort of remarkable that the combination of the initial data
41:09load caching means that we, like one of our objectives is we want to be
41:13as fast as just querying the database directly for an initial data load
41:17and then orders of magnitude faster for anything that then subsequent
41:21requests coming out of the cache, but also this sort of challenge with.
41:25Almost like the, this thing about saying, okay, you're building an application.
41:29You maybe want some of the user experience or developer experience
41:32affordances of local-first, but if to do that, I need a sync engine and a
41:36sync engine is kind of a complex thing.
41:39And so you end up either going, okay, maybe I'll sort of use an external system.
41:44And then you get like, A siloed real time database in your main database
41:47and you get operational complexity, or you get some sort of system where
41:51you have, yeah, you're basically of stewarding these web sockets and
41:54it's very easy for it to fall over.
41:56And I think actually, like, if you just sort of honestly view that
42:00type of, architectural decision from the lens of like somebody trying to
42:04build a real project, which is their day job, trying to get stuff done.
42:08You're just going to avoid that as much as you can, because like
42:10you'd far rather just like, I just want to serve this with Nginx.
42:13I know how that's going to work.
42:14I'm not going to stay up at night worrying about it.
42:17Whereas if I have 10,000 concurrent users going through some crazy WebSocket stuff,
42:20I'm going to get pager alerts.
42:22And so like the whole approach here with what we're trying to do is to
42:26change that sense that sync is a complex technology that you sort of
42:31play with on the weekend and only adopt when you have to.
42:34So going, look, you can actually do sync in such a way that it is
42:37just as simple and standard as normal web service technology.
42:41And then suddenly you can actually unlock the ability for kind of real
42:44projects you know, you can take this stuff into a day job and not, get it
42:47shouted down at the design meeting.
42:49Cause it just feels like too much black box complexity.
42:52You're using the word simple here.
42:54And I think that really speaks to me now, because it's both simple in terms of
43:01architecturally, like, how does data flow?
43:04so I think this is where Electric provides a very simple and I think
43:09easy to use and easy to work with trade off, like, how does data flow,
43:14but then it also gives a very simple answer of like, how does it scale?
43:19Since you can throw at it like all the innovations and all the hard
43:23work that has now gone into like our web infrastructure for the last
43:27decades, you can run on the latest and greatest and all the innovations that
43:33Nginx and HAProxy and Cloudflare and like all the work that has gone into that.
43:39You can just piggyback on top of that without having to innovate on the
43:44networking side as well, since like you, you're really doing the hard work
43:48on the more semantic and data side.
43:51And that's a really, really elegant trade off to me.
43:54Yeah.
43:54And it's, it's fun because like our benchmarking testing at the
43:56moment, like we break CloudFlare before we break Electric.
43:59if something is battle tested, it's CloudFlare.
44:02It again, it carries on because it's not just about this sort of
44:05scalability or operational stuff.
44:06It's also about then how you can achieve, like we talked about the write patterns.
44:10And so this sort of pattern of how do you do writes?
44:12And it's like, well, actually you can do the sync like this, use
44:14your existing API to do writes.
44:16And it can work with your existing stack.
44:19But you have other obvious concerns with this type of architecture, like
44:22say, authentication, authorization, data security, encryption.
44:27But HTTP
44:29just has proxies and it works with the sort of middleware stack.
44:33And so for us, a shape endpoint as a sync endpoint is just a HTTP resource.
44:40So if you want to just put like an authorization service in front of it,
44:43you just proxy the request through and you like, you have the context from
44:47the user, you can have the context about the shape and you can just
44:49authorize it using your existing stack.
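Sketched as a framework-agnostic fetch handler (all names and routes here are hypothetical), such an authorizing proxy might look like this: authenticate with your existing machinery, constrain the shape to what the user may see, and pass everything else straight through to Electric.

```ts
const ELECTRIC_URL = 'http://localhost:3000/v1/shape'

// Stand-in for your existing session/JWT verification.
async function authenticate(request: Request): Promise<{ id: string } | null> {
  const token = request.headers.get('Authorization')
  return token ? { id: 'user-id-from-token' } : null
}

export async function handleShapeRequest(request: Request): Promise<Response> {
  const user = await authenticate(request)
  if (!user) return new Response('Unauthorized', { status: 401 })

  // Forward the sync parameters (offset, live, etc.) but pin the shape
  // definition server-side so clients can only sync their own rows.
  const incoming = new URL(request.url)
  const upstream = new URL(ELECTRIC_URL)
  incoming.searchParams.forEach((value, key) => upstream.searchParams.set(key, value))
  upstream.searchParams.set('table', 'projects')
  upstream.searchParams.set('where', `owner_id = '${user.id}'`)

  // The response streams straight back through the proxy.
  return fetch(upstream)
}
```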
44:52If you want to do encryption, then you can do that.
44:54It's just a stream of messages.
44:55And yeah, a bit like you were saying that, like with Electric, you could
44:58just use it as a transport layer to like, say, route a log of messages.
45:03That can be ciphertext or plaintext.
45:05So you could just like encrypt on device, sync it through.
45:08You can just decrypt whenever you're consuming the stream.
45:11And again, you could do that, like in the client, you could
45:13do that in HTTP middleware.
45:15So a lot of the sort of concerns, which, like certainly our experience of trying
45:20to build a more integrated end to end local-first stack, you know, you go,
45:24okay, we need to, we need to solve this.
45:25I need a security rule system because suddenly there is no API and how am
45:29I going to authorize the data access?
45:30And it's like, we don't need a security rule system.
45:33Because you can just use, you can just use normal API middleware
45:37in front of an HTTP service.
45:39And so you just sort of take that problem out of scope and like the
45:42system doesn't need to do encryption.
45:44It doesn't need to provide like a kind of hooks mechanism or some sort
45:47of framework extensibility because the protocol is extensible and just,
45:51you just have all of this ecosystem of existing tooling built around it.
45:55So it is, I mean, it's been fantastic for us because it, because it
45:59simplifies all of these aspects.
46:01And allows us to go, look, this is how you can achieve, say
46:03authorization with Electric, but again, it pushes it out of scope.
46:07So we get to focus our engineering resources on just doing the core stuff
46:11to deliver on this core proposition.
46:13So which sort of things would you say are particularly tricky from a application
46:19of all perspective with Electric, where it might be not as much of a good fit?
46:23I think, One of the things is that we sync through the database and that has latency.
46:31And so if you're trying to craft a really low latency real time multiplayer
46:36experience, like, or even doing things where in a way it doesn't really
46:41make sense to be, synchronizing that information through the database layer,
46:46then it's maybe not the best solution.
46:49So sort of for like presence features, let's say in Figma, where
46:54you see my mouse cursor moving around, those sort of things.
46:57yes, it would be nice if it was in real time shared across the various
47:01collaborators, but you don't need a persistent trace of that for
47:05eternity in your Postgres database.
47:07So I think a common approach for that as well is just to have like
47:11two kind of different channels for how your data flows, like your,
47:15persisted data that you want to actually keep around as a fixed trail.
47:19Like, did I create this GitHub issue or not?
47:22But like how my mouse cursor has moved around, it's fine that that's
47:26being broadcasted, but if someone opens it an hour later, it's fine
47:30that that person would never know.
47:32So for this sort of use case, it's overkill basically
47:38to pipe that through Postgres.
47:38Yeah.
47:39And you know, it's.
47:39For us, Postgres is a big qualifier.
47:41It's like, if you, if you want to use Postgres, if you have an existing Postgres
47:46backed system, like Electric shines where like, yeah, you have, you already use
47:51Postgres or you know that you want to be using Postgres, maybe you already have a
47:54bunch of integrations on the data model already, maybe you do have existing API
47:58code, like this is the scenario where we're really trying to say, well, look,
48:02in that scenario, this is a great, pathway to move towards these more advanced
48:07local-first sync based architectures, where, whereas if you look at it from a
48:11sort of more greenfield development point of view, and you're trying to craft a
48:15particular concurrency semantics, say, you would reach for Automerge and you
48:20would get custom data types, which you can craft advanced kind of invariant
48:24support with your kind of data types.
48:27But of course, you know, so that's a slightly different sort of world.
48:30And, and I think so almost probably for sort of a lot of people in the local-first
48:35space dive into CRDTs and so forth, you know, it's really, it's fascinating
48:39to try to sort of craft these sort of optimized, kind of, presence-style,
48:44immediate real time streaming experiences.
48:47And so whilst we do real time sync, it's almost more about keeping the data fresh
48:52and just sort of making sure that the clients are sort of eventually consistent
48:56rather than making that more sort of game kind of experience where, you
49:00know, where maybe peer to peer matters more or of finding clever hacks to have
49:03very low latency kind of interactions.
PGlite
49:06That makes a lot of sense.
49:07So now we've talked a lot about Electric and Electric is the name of the company.
49:12It's the name of your main product.
49:14But there's also been a project that I'm not sure whether you
49:17originally created, but it's certainly in your hands at this point.
49:21It's called PGlite.
49:23That made the rounds on Hacker News, etc.
49:26Also through a joint launch with the folks at Supabase.
49:29What is PGlite?
49:31What is that about?
49:33Yeah, so I mean, interestingly with Electric, we started off, building
49:37a stack, which was syncing out of Postgres into SQLite because it
49:42made sense as the sort of main like embeddable relational database.
49:45And I remember speaking to Nikita, who is the CEO at Neon, the Postgres database
49:50company, and some of his advice, from having built SingleStore (formerly MemSQL), was that the
49:57impedance mismatch between the two database systems and their data type systems
50:02will continue to be a source of pain for as long as you build that system.
50:06And so we were just having these conversations about, how do we
50:09make this Postgres-to-Postgres sync?
50:11Then you can just eliminate any mismatch.
50:15You don't even need to do any kind of serialization of the data.
50:19You can literally take it exactly as it comes out of the binary
50:23format that comes through in a query or the replication stream from Postgres,
50:26put that into the client, and you can have exactly the same data
50:29types and exactly the same extensions.
50:31So this was a sort of motivation for us.
50:32And co-founder Stas, the CTO at Neon, had done an experiment
50:37to try and make a more efficient Wasm build of Postgres that could
50:41potentially run in the client.
50:43So previously there'd been some really cool work by Supabase, by Snaplet, a
50:47few teams, which had developed these sorts of VM-based Wasm Postgreses.
50:52But they were pretty big,
50:53they didn't really have persistence.
50:54They were more of a kind of proof of concept.
50:57And the approach that Stas took was to do a pure Wasm build and
51:02run Postgres in single-user mode.
51:04And that allowed you to basically remove a whole bunch of the concurrency
51:09stuff within Postgres, which allowed us to make a much, much smaller build.
51:13So they shared that repo.
51:15And we played with it for a little while.
51:18Didn't quite manage to make it work.
51:20And then one of the guys on our team, Sam Willis, just picked it up one week
51:23and put in some concerted effort and basically managed to pull it together
51:27with persistence as a three-meg build.
51:30And it worked, and so suddenly we had this project which was a three-meg build.
51:34SQLite, for context, is about a one-meg Wasm build, and Postgres is a much
51:39larger system, so you'd think it would be much bigger, but suddenly it's
51:41not that far off in terms of the download speed, and it could just run as a fully
51:46featured Postgres inside the browser.
51:48And so we tweeted that out and it's gone a bit crazy.
51:50I think it's the fastest-growing database project ever on GitHub.
51:54It's like 250,000 downloads a week nowadays.
51:57There are lots and lots of people using it.
51:59Supabase are using it in production.
52:00Google are using it in production.
52:02Lots of people are building tooling around it, like Drizzle integrations, et cetera.
52:06And it's the sort of thing that just should exist, right?
52:08There should be a Wasm build of Postgres. Just being able to have
52:11the same database system, instead of mapping into an alternative one,
52:15has these fundamental advantages, and also a lot of people have been
52:21coming up with a whole range of interesting use cases for it as a project.
52:25So some people are interested in running it inside edge workers,
52:28as a sort of data layer that you can hydrate data into
52:31for background jobs.
52:33Some people are interested in running it as just a development database.
52:37So you can just npm install Postgres.
52:38And if you're running an application stack, you don't have to
52:41run Postgres as an external service.
52:43The same thing in your testing environment.
52:46So there's a whole bunch of different use cases.
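As a rough illustration of the "npm install Postgres" point, here is a minimal sketch of using PGlite as an in-process development database. It assumes the published `@electric-sql/pglite` package and its documented `PGlite` / `exec` / `query` API; the table and data are invented, so check the PGlite docs for current details.

```typescript
import { PGlite } from '@electric-sql/pglite'

// In-memory by default; pass a filesystem path (Node) or e.g. 'idb://my-app'
// (browser) to persist data across restarts.
const db = new PGlite()

await db.exec(`
  CREATE TABLE IF NOT EXISTS issues (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL
  );
`)

await db.query('INSERT INTO issues (title) VALUES ($1)', ['Try PGlite'])

const { rows } = await db.query('SELECT id, title FROM issues')
console.log(rows) // e.g. [{ id: 1, title: 'Try PGlite' }]
```

The same instance works inside a test runner, so a test suite doesn't need an external Postgres service or a Docker container.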
52:48And in fact, some of the work that, for instance, Supabase have
52:51done is they built a very cool project called database.build,
52:55which is a sort of AI-driven, database-backed application builder.
53:00So it's an AI app builder for building Postgres-backed
53:02applications, and it just runs purely on PGlite in the client.
53:07And so that's a demonstration of where
53:09this sort of database infrastructure for running software is going: you had
53:13centralized databases, and then you had this move to serverless
53:16with separation of compute and storage.
53:18And now you have this model where you can actually run the compute,
53:21with a whole range of different storage patterns, in the client.
53:24And you don't even need to deploy any infrastructure on the server
53:28to run database-driven applications.
53:30It really reminds me of that time when JavaScript was
53:34getting more and more serious.
53:35And at some point there was Node.js, and suddenly you could run the same sort of
53:40JavaScript code that you were running in your browser, now also on the server.
53:45And well, the rest is history, right?
53:47Like that changed the web forever.
53:50It has dramatically changed how JavaScript has become
53:54the default full-stack foundation for almost every app these days.
53:59And there seem to be a lot of similar characteristics here,
54:02just this time the other way around: with Node it went from the browser to the server,
54:07whereas here it's going from the server into the client. But that seems like a huge deal.
54:11Yeah, you know, you step forward and we sort of see, I guess, some
54:15of these trends in data architecture, and it can just
54:19be the same database everywhere.
54:20And in a way, it's almost logically extended to wherever you want.
54:23You can just have this idea of
54:28declarative configuration of what data should sit where.
54:31AI systems can optimize transfer and placement, and it is just
54:35all the same kind of data types.
54:37And I think this is sort of where systems are moving to, but there are also
54:40some of these things we've been learning with PGlite. For
54:44instance, if you're running a system that relies on having, say, a database
54:48behind your application, and say it's a SaaS system and you're spinning up some
54:51infrastructure for a client, with PGlite you don't necessarily need to spin up a
54:55database in order to serve that client.
54:57So if you think about something like the free tier of a SaaS platform like that,
55:01it can just change the economics of it.
55:04It can do that on the server by just allowing you to have
55:06the Postgres in process,
55:08so you're not deploying additional infrastructure.
55:10But you can also move it all the way into the client, and then there's just
55:13no compute running for it on the server.
55:15It just moves even more of the compute onto the client.
55:18And I think it obviously aligns with local-first in
55:21general, but I know some of the stuff we've talked about before around the
55:24concept of local-only first.
55:27As a developer experience for building software, one of the
55:30things that LiveStore is specifically designed to support is this ability
55:35to build an application locally with very fast feedback and iteration.
55:40And then you progressively add on, say, sync or persistence and
55:43sharing and things when you need to.
55:45And I think this sort of model of being able to build the software on
55:48a database like PGlite and then go, okay, I've played with this enough,
55:52I want to save my work.
55:53And it's at that point that you write out to blob storage, or you
55:57maybe provision the database to be able to save the data into.
56:00Yeah, I think you've touched on something really interesting and something really
56:04profound, which I think are kind of two second-order effects of local-first.
56:09And so one of them is for the app users directly.
56:13So ideally it should just become so cheap and so easy to offer the full
56:19product experience, sort of as a taste, fully on the client, that it's
56:24no longer sitting behind a paywall.
56:26If the product experience generally allows for that, if it's sort of
56:30a note-taking tool or something like that, then I should be able to
56:35fully try out the app on my device and do the signup later, and it should
56:41be economical to offer that.
56:44With those new technologies, that's basically no longer
56:47an argument, so you can offer it.
56:49So hopefully that will be a second-order effect where software is way easier to
56:54offer, where it's way easier to just try it out from an end-user perspective.
56:59But then also, the second point, from an application developer
57:04perspective: I think it makes a huge difference in terms of complexity
57:08when you build something whether it is just a local script
57:12without any infrastructure, whether it has no infra
57:16dependencies and you can just run it, maybe with your Vite dev server.
57:22And that's it.
57:22It's self-contained and you can move on.
57:25There's no Docker thing you need to start, et cetera.
57:29That's your starting point.
57:31And if the barrier to entry there, if that threshold is lower,
57:35so that you can build a fully functional thing just for yourself, just in that
57:39local session, and you can get started this way, and if you then see,
57:44oh, actually, there's a case here where I want to make this a multiplayer
57:48experience or a multi-tenant experience, then you can take that next step.
57:53But right now you can't really leap ahead there.
57:56You need to start from that multi-tenant, that multiplayer experience,
58:00and that makes the entry point already so much more tricky that many
58:04projects never get started.
58:06And I think both of those can be second-order effects and
58:10improvements that local-first inspired architectures and software can provide.
58:16So, I love those observations.
58:18Yeah, yeah, totally.
58:20And I mean, it's interesting as well that a
58:23lot of people do define their database schema using tools like Prisma or Drizzle;
58:29Effect Schema, which obviously you're working on, is a great example.
58:33The more layers of indirection there are between where you're, say, iterating on the
58:37user experience in the interface and where you customize
58:40the data model, the harder it is to iterate there quickly.
58:44But if you have to go all the way into some other language, another
58:47system, it just takes you out of context and slows everything down.
58:50So there's the ability to apply that sort of schema into
58:54the local database, and not have to work against these different
58:59legacy layers of the stack in order to actually be able to build things out.
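To sketch what keeping the schema close to the UI code can look like, here is a hypothetical example that defines a table with Drizzle and runs it against a local PGlite instance. It assumes Drizzle's PGlite driver (`drizzle-orm/pglite`); the table and column names are invented, and the exact driver API should be verified against the Drizzle and PGlite docs.

```typescript
import { PGlite } from '@electric-sql/pglite'
import { drizzle } from 'drizzle-orm/pglite'
import { pgTable, serial, text, boolean } from 'drizzle-orm/pg-core'

// Schema defined in the same language and codebase you iterate on the UI in.
const issues = pgTable('issues', {
  id: serial('id').primaryKey(),
  title: text('title').notNull(),
  done: boolean('done').notNull().default(false),
})

const client = new PGlite() // in-memory, no external database to run
const db = drizzle(client)

// For quick local iteration, create the table directly; a real project
// would generate and run migrations (e.g. with drizzle-kit) instead.
await client.exec(`
  CREATE TABLE IF NOT EXISTS issues (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    done BOOLEAN NOT NULL DEFAULT false
  );
`)

await db.insert(issues).values({ title: 'Iterate on the data model locally' })
console.log(await db.select().from(issues))
```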
The relation between Electric and PGlite
59:03So going back to PGlite for a moment, how do PGlite and Electric, Electric
59:09as a product and Electric as a company, how do those things fit together?
59:14Yeah.
59:14I mean, there basically are sort of two main products.
59:18We have two products.
59:19They're both open source, Apache licensed.
59:22One is the Electric Sync Engine, and one is PGlite.
59:26And so you can use them together, or you can just use them independently,
59:31so it's not like the Electric system is designed only to sync into PGlite,
59:35you don't have to have an embedded Postgres to use Electric, and
59:38you can use PGlite just standalone.
59:41There's a range of different mechanisms to do things like data
59:44loading, data persistence, et cetera, virtual file system layers,
59:48loading in and unpacking Parquet files.
59:51But if you do have an application with this local database, and you want to
59:56then be able to sync that data with other users or into your Postgres database,
59:59then Electric is just a great fit.
1:00:01And obviously we make a first-class integration.
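As a hedged sketch of that first-class integration: PGlite can load a sync extension that consumes an Electric "shape" and keeps a local table up to date. The package and option names below (`@electric-sql/pglite-sync`, `electricSync`, `syncShapeToTable`) are from the public docs as best recalled, and the URL, table, and columns are placeholders, so treat this as an outline to verify against the current Electric and PGlite documentation.

```typescript
import { PGlite } from '@electric-sql/pglite'
import { electricSync } from '@electric-sql/pglite-sync'

// Create a PGlite instance with the Electric sync extension enabled.
const db = await PGlite.create({
  extensions: { electric: electricSync() },
})

// The local table the shape will be written into.
await db.exec(`
  CREATE TABLE IF NOT EXISTS todos (
    id UUID PRIMARY KEY,
    title TEXT NOT NULL,
    completed BOOLEAN NOT NULL DEFAULT false
  );
`)

// Stream the 'todos' shape from a (placeholder) Electric sync service URL
// into the local table; Electric keeps it fresh as Postgres changes.
// Option names may differ between versions -- check the docs.
await db.electric.syncShapeToTable({
  shape: {
    url: 'http://localhost:3000/v1/shape',
    params: { table: 'todos' },
  },
  table: 'todos',
  primaryKey: ['id'],
})
```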
1:00:04So I think for us, as a company, as a startup, Electric is the
1:00:09main product that we aim to build the business around, because in a way that
1:00:14type of operational data infrastructure is just slightly more natural to build
1:00:18a commercial offering around: you have to run servers to move the data
1:00:21around, we can do that efficiently, so it makes sense and adds value.
1:00:25Whereas with PGlite as an open source embedded database, it's not
1:00:29something that we're aiming to monetize in quite the same way.
1:00:32And potentially, maybe it could be upstreamed into Postgres, like, you know,
1:00:37there should be a Wasm build of Postgres.
1:00:39Or, you know, maybe it moves into a foundation and sort
1:00:42of develops more governance. Certainly already with PGlite,
1:00:47Supabase co-sponsored one of the engineering roles with
1:00:50us, and there have been contributions from a whole bunch of companies.
1:00:53So it is already quite broad in terms of
1:00:56the stakeholders who are stewarding the development of the project.
1:01:00That is very cool to see.
1:01:01I'm a big fan of those sort of multi-organizational approaches where you
1:01:06share the effort of building something.
1:01:09And, yeah, I love that.
1:01:11I'm very excited to get my own hands on PGlite as well.
1:01:14I'm mostly dealing with SQLite these days just because I think it is
1:01:18still a tad faster for those single-threaded embedded use cases.
1:01:23But if you need the raw power of Postgres, which often you do, then
1:01:27you can just run it in a worker thread and you get the full power of Postgres
1:01:31in your local app, which is amazing.
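As a minimal sketch of the worker-thread setup being described, here is a plain Web Worker that owns a PGlite instance and answers queries from the main thread over postMessage. PGlite also ships its own dedicated worker wrapper for exactly this, so in practice you would likely reach for that; the message shape below is invented purely for illustration.

```typescript
// pglite-worker.ts -- runs in the worker thread and owns the database.
import { PGlite } from '@electric-sql/pglite'

const db = new PGlite('idb://my-app') // persisted in IndexedDB

self.onmessage = async (
  event: MessageEvent<{ id: number; sql: string; params?: unknown[] }>
) => {
  const { id, sql, params } = event.data
  const result = await db.query(sql, params)
  // Invented message shape: { id, rows }.
  postMessage({ id, rows: result.rows })
}
```

```typescript
// main.ts -- the UI thread queries Postgres without blocking on it.
const worker = new Worker(new URL('./pglite-worker.ts', import.meta.url), {
  type: 'module',
})

let nextId = 0
function query(sql: string, params?: unknown[]): Promise<unknown[]> {
  return new Promise((resolve) => {
    const id = nextId++
    const onMessage = (event: MessageEvent<{ id: number; rows: unknown[] }>) => {
      if (event.data.id === id) {
        worker.removeEventListener('message', onMessage)
        resolve(event.data.rows)
      }
    }
    worker.addEventListener('message', onMessage)
    worker.postMessage({ id, sql, params })
  })
}

console.log(await query('SELECT 1 AS one'))
```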
1:01:34So maybe rounding out this conversation on something you just touched on,
1:01:38which is a potential commercial offering that Electric provides.
1:01:42Can you share more about that?
Electric commercial offering
1:01:47Yep, so we're building a cloud offering, which is basically
1:01:51hosting the Electric sync service.
1:01:53So, for instance, we don't host the Postgres database.
1:01:57We don't host your application.
1:01:59We just host that core sync layer, and then that can integrate
1:02:03with Postgres hosts like Supabase, Neon, et cetera, and other
1:02:07platforms for deploying applications.
1:02:09That's our sort of first commercial offering.
1:02:12And we sort of see that as almost a utility data infrastructure
1:02:17play, where we've put a lot of effort into being able to run the software
1:02:22very resource-efficiently, and with flat resource usage, so
1:02:26it doesn't, you know, scale up in memory with concurrent users, etc.
1:02:30So we want to be able to run that very efficiently.
1:02:32And so we see that as low-cost, usage-based pricing,
1:02:36based basically on the data flows running through the software.
1:02:39I think, you know, monetizing open source software is
1:02:43an interesting topic, but there are also a lot of
1:02:45common patterns that are well known.
1:02:47And ultimately our aim as a company is: we want people building real
1:02:54applications with this technology, and we want developers to enjoy doing it
1:02:58and become advocates of the technology.
1:03:01And then there is a pathway where, imagine that you're a large company
1:03:05and say you have five projects and they're all using Electric sync.
1:03:09It's very common for those sorts of larger companies to need
1:03:12additional tooling around that.
1:03:13So governance, compliance, data locality.
1:03:17There's a whole bunch of considerations there.
1:03:19So it's quite common to be able to build out a sort of enterprise offering
1:03:22on top of the core open source product.
1:03:25And so, you know, there are various routes like that that we
1:03:27could choose to pursue in future.
1:03:29And maybe that's how it plays out: as we build the cloud, we focus on making
1:03:33this sync engine and these components bulletproof, making sure people are being
1:03:37successful building applications on them.
1:03:39And then we can look at maybe some sort of value-added tooling to help you
1:03:42operate them successfully at scale, or help you operate them within sort of
Outro
1:03:49That makes a lot of sense.
1:03:50Great.
1:03:51James, is there anything that you would want from the audience?
1:03:55Anything that you want to leave them with?
1:03:57Anything to give a try over the next weekend?
1:03:59The holidays are upon us.
1:04:01what should people take a look at?
1:03:59Yeah, I know that you may be listening to this at any time in
1:04:05the future, but we're recording this in the lead-up to December.
1:04:09So if you have some time to experiment with tech over the holiday period,
1:04:12just take a look at Electric.
1:04:14You know, it's ready for production use.
1:04:16It's well documented.
1:04:17There's a whole bunch of example applications.
1:04:19So there's a lot that you can get stuck into there.
1:04:21So please do come along and check it out; our website is electric-sql.com.
1:04:26We have a Discord community.
1:04:28There are about 2,000 developers in there.
1:04:30So that's linked from the site.
1:04:32We're on GitHub at Electric SQL,
1:04:34so you can see the Electric and the PGlite repos there.
1:04:37And so those are the main things.
1:04:39And if you're interested, for instance, in building applications, we already
1:04:43have a wait list for the new cloud service, and we're starting now to
1:04:46work with some companies to help manually onboard them onto the cloud.
1:04:50So if a cloud offering for hosted Electric is important to you, let us know,
1:04:54and there's a pathway there to work with us if you're interested in being
1:04:57an early adopter of the cloud product.
1:04:59But also just, we spend a whole bunch of time talking to teams
1:05:02and people trying to use Electric.
1:05:04So our whole goal as a company is to help people be successful building on this.
1:05:09And so if you've got questions about
1:05:11how best to approach it, or challenges with a certain application architecture,
1:05:14we're very happy to hop onto a call and chat stuff through.
1:05:16So if you come into the Discord channel, say hi and just ask any questions, and
1:05:21we're happy to help as much as we can.
1:05:22That sounds great.
1:05:24Well, I can certainly plus-one that: anyone I've interacted with from your
1:05:28company has been, A, very helpful and, B, very, very pleasant to interact with.
1:05:34And also at this point, a big thank you to Electric, not just for building what
1:05:38you're building, but also for supporting me and helping me build LiveStore.
1:05:42You've been sponsoring the project for a little while as well, which I very
1:05:46much appreciate, and there's actually a really cool Electric LiveStore syncing
1:05:51integration on the horizon as well.
1:05:53That might be, some potential topic for a future episode, but I think with
1:05:58that, now we've covered a lot of ground.
1:06:00James, thank you so much for coming on the podcast, sharing a lot of knowledge
1:06:05about Electric and about PGlite.
1:06:07Thank you so much.
1:06:08Yeah.
1:06:09Thanks for having me.
1:06:10Thank you for listening to the Local First FM podcast.
1:06:13If you've enjoyed this episode and haven't done so already, please
1:06:16subscribe and leave a review.
1:06:18Please also share this episode with your friends and colleagues.
1:06:21Spreading the word about this podcast is a great way to support
1:06:24it and help me keep it going.
1:06:26A special thanks again to Rocicorp and PowerSync for supporting this podcast.