JSON columns are really useful. When prototyping applications in SQLite or Postgres you can just dump everything into a JSON column, and as things begin to cool down and stabilise, you can migrate fields into their own columns.
Postgres goes even further than SQLite in terms of JSON functionality, with things like JSON aggregates.
And if you wanna go /even/ further you can add schemas to the JSON itself [1]
1: https://supabase.com/blog/pg-jsonschema-a-postgres-extension...
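A sketch of the aggregate bit mentioned above, in Postgres (table and field names are invented for illustration):

    -- Postgres: aggregate JSON documents back into a JSON array
    CREATE TABLE orders (id bigserial PRIMARY KEY, doc jsonb NOT NULL);

    -- One JSON array of order documents per customer
    SELECT doc->>'customer_id' AS customer_id,
           jsonb_agg(doc) AS orders
    FROM orders
    GROUP BY doc->>'customer_id';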
> JSON columns are really useful. When prototyping applications in SQLite or Postgres you can just dump everything into a JSON column, and as things begin to cool down and stabilise, you can migrate fields into their own columns.
I agree. That's precisely how I've been managing my proof-of-concept projects, and it's a treat. Previously I used ORMs for the same reason, but they ended up requiring too much extra work to set up and manage.
With JSON columns, it's a matter of taking the JSON payload and storing it directly, then fetching the payload and massaging it in any way I need.
If we're already getting JSON inputs and sending JSON outputs, spending time adding and managing a whole new component to map to/from an intermediate format is something that's very hard to justify.
What am I missing here? There's no special functionality related to the JSON being stored; it's just being saved and retrieved relative to some column values.
I was expecting something like being able to index on individual attributes, like in Firebase.
You can use an expression index in SQLite to index a path inside a JSON column, or any other expression.
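For example, something like this (table and field names are made up):

    -- SQLite: index a path inside a JSON column with an expression index
    CREATE TABLE users (id INTEGER PRIMARY KEY, data TEXT);

    CREATE INDEX users_email_idx ON users (json_extract(data, '$.email'));

    -- A query using the same expression can use the index
    SELECT id FROM users WHERE json_extract(data, '$.email') = 'a@example.com';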
I thought the point of nosql was O(1) GET and UPDATE operations
I think you are thinking "hash maps", not NoSQL. There are some ways of persisting hash maps to disk [1], but the workhorse of most ACID DBs (including popular NoSQL DBs like Mongo) is the B-Tree, which has O(log n) complexity for all operations.
In any case, when working with JSON columns in SQL it is useful to index the documents' internal fields, which the OP doesn't talk about. I found a nice example in this blog post [2].
Why use B-Trees if the keys are out of order?
DBs default to B-Trees even for synthetic primary keys because range queries are almost always necessary, for instance if you want to paginate records. Some DBs also offer a hash index in addition to B-Trees, for instance Postgres [1], but they still use B-Trees for primary keys. SQLite doesn't support hash indexes.
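A sketch of that distinction in Postgres (names invented):

    -- Postgres: the primary key is still backed by a B-Tree index
    CREATE TABLE sessions (id bigserial PRIMARY KEY, token text, payload jsonb);

    -- An explicit hash index: supports equality lookups only
    CREATE INDEX sessions_token_hash ON sessions USING HASH (token);

    SELECT payload FROM sessions WHERE token = 'abc';  -- can use the hash index
    -- Range scans and ORDER BY still need a B-Tree index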
There's a very short chapter in "Modern B-Tree Techniques" [2] that compares B-Trees with hash indexes:
* Durability schemes are simple and robust for B-Trees, not so for hash maps.
* Ordering has advantages for concurrency control by means of key range locking.
* "Very few database implementation teams find hash indexes among the opportunities or features with a high ratio of benefit and effort, in particular if B-tree indexes are required in any case in order to support range queries and ordered scans."
--
1: https://www.postgresql.org/docs/current/indexes-types.html#I...
2: https://w6113.github.io/files/papers/btreesurvey-graefe.pdf
very nice thanks
By GET, do you mean “get all fields of the unique item with id=x”? OLTP (SQL) databases, Mongo, and Cassandra all work similarly for this and have the same complexity. Even worse, the same operation on the same storage can have a very different performance profile independently of cluster size, making O(n) complexity a very secondary discussion. All technologies offer more or less the same data structures and algorithms at such a basic level. You have to dig into replication, disk layout, transactionality, constraints, … to start to see differences.
> I thought the point of nosql was O(1) GET and UPDATE operations
That was never the case. It would be silly to assume that, given that ultimately your GET and UPDATE operations are inter-process requests which involve serialization/deserialization, and most of the time even require at least a network call.
Network and encode/decode costs are still constant in terms of the dataset size. Still O(1).
O(1) doesn't mean "instant"; it means "does not grow with the dataset".
I like this. You can also use triggers on other tables to drive cross-table updates. SQLite is a gem.
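A minimal sketch of the kind of trigger they might mean (table names invented):

    -- SQLite: a trigger on one table driving an update in another
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER);
    CREATE TABLE authors (id INTEGER PRIMARY KEY, post_count INTEGER NOT NULL DEFAULT 0);

    CREATE TRIGGER posts_after_insert AFTER INSERT ON posts
    BEGIN
        UPDATE authors
        SET post_count = post_count + 1
        WHERE id = NEW.author_id;
    END;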
I really don't get this.
The entire point of design time is to understand and design your data first.
> The entire point of design time is to understand and design your data first.
What's hard to understand? You design a data store to store data, and aggregate data is still data. Do you care about the low-level details of an aggregate data type? Do you care about how a UUID is implemented, or even whether you store it as a string, a 128-bit integer, or a bitmap? Do you care if you store the UUID type along with the UUID?
Or do you simply want to store an ID that's produced/consumed by a client?
Now, replace UUID with JSON and you arrive at the same conclusion.
I do care.
I care about how data is stored. There are limits to every data type, and time spent on that in the beginning is time gained down the road.
I'm not sure you understand why your comment makes no sense in the context of SQLite: under the hood it stores most data types as text strings, whereas JSON can be stored in its binary JSONB format.
https://sqlite.org/draft/jsonb.html
Therefore, even if you had any remotely relevant reason for that fundamentalism, storing JSON documents in SQLite does more to meet that personal, arbitrary goal than storing them in tabular form.
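For illustration, a minimal sketch of what that looks like (requires a JSONB-capable SQLite build, 3.45+; names invented):

    -- SQLite: store the binary JSONB encoding instead of JSON text
    CREATE TABLE docs (id INTEGER PRIMARY KEY, body BLOB);
    INSERT INTO docs (body) VALUES (jsonb('{"a": 1}'));

    -- The JSON functions accept both text JSON and JSONB
    SELECT json_extract(body, '$.a') FROM docs;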
My tip: start a project with a NoSQL store, and when you are certain about your schema, consider traditional SQL.
It’s so comfortable to iterate quickly without schema constraints
At that point, why not just use a SQL database with a JSONB column and then add columns as you slowly finalize your schema? :)
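Concretely, that flow might look roughly like this (assuming Postgres; names are illustrative):

    -- Postgres: prototype with a single JSONB column...
    CREATE TABLE events (id bigserial PRIMARY KEY, doc jsonb NOT NULL);

    -- ...then promote a field to a real column once it stabilises
    ALTER TABLE events ADD COLUMN user_id bigint;
    UPDATE events SET user_id = (doc->>'user_id')::bigint;

    -- Optionally remove the promoted key from the documents
    UPDATE events SET doc = doc - 'user_id';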
Does this ever actually happen? In my experience, by the time you realize you need regular SQL there is so much code that changing the database is a major project. It will also halt all other feature development.
I like the idea in theory, but in practice having a big JSON blob that I need to refactor later has never felt great. Makes more sense to me to just do it right.
Yup. I tend to start with SQL tables as well, although I feel the pain during the heavy iteration phase.
I tend to nuke my entire DB, recreate it, and reload some data into it. I wish there was a tool that ran in watch mode and patched the DB based on schema edits.
> It’s so comfortable to iterate quickly without schema constraints
I'd be very careful about this simplistic reasoning. The whole point of schemas is data validation, and the hard part of "iterating quickly" is migrating data in a way that ensures it remains valid.
Neither of these requirements is tied to relational databases.
At most, breaking a schema ensures you cannot proceed with broken data without undergoing a migration, or completely wiping your database. But you are already doing that, aren't you?
An RDBMS with a dynamic JSON column is totally different from the NoSQL mindset and design patterns. If you are going to design your data in a relational manner but with dynamic attributes, an RDBMS is still better at the job.