|
| jph wrote:
| The post leads to a good Supabase Postgres auditing tool, a PG
| extension named supa_audit:
|
| https://github.com/supabase/supa_audit
| simonw wrote:
| I've battled this problem many times over my career in many
| different ways - this approach looks very solid to me. I
| particularly like the way the schema design incorporates a record
| ID, old record ID, updated record and old record.
|
| You do end up storing a lot of extra data, but it's very thorough
| - it's not going to miss anything, and figuring out what happened
| later on should be very easy.
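|
| For reference, a rough sketch of the kind of audit table being
| described (illustrative column names, not necessarily the
| extension's exact schema):
|
| ```
| create schema if not exists audit;
|
| -- one row per captured change; both sides of the change are kept
| create table audit.record_version (
|   id            bigserial primary key,
|   record_id     uuid,        -- identity of the row after the change
|   old_record_id uuid,        -- identity of the row before the change
|   op            text,        -- INSERT / UPDATE / DELETE / TRUNCATE
|   ts            timestamptz not null default now(),
|   table_schema  name not null,
|   table_name    name not null,
|   record        jsonb,       -- new version of the row
|   old_record    jsonb        -- previous version of the row
| );
| ```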
| lichtenberger wrote:
| I'd argue that's still a lot of work to do manually. However,
| great work and detail, thanks a lot :-)
|
| I'm working on a database system[1] in my spare time, which
| automatically retains all revisions and assigns revision
| timestamps during commits (a single timestamp in the
| RevisionRootPage). Furthermore, it is tamper-proof, and the whole
| storage can be verified by comparing a single UberPage hash, as in
| ZFS.
|
| Basically it is a persistent trie-based revision index (plus
| document and secondary indexes) mapped to durable storage, a
| simple log-structured append-only file. A second file tracks
| revision offsets to provide binary search over an in-memory map of
| timestamps. Because the root of the tree is swapped atomically, it
| does not need a WAL, which is basically just another data file and
| can be omitted in this case.
|
| Besides versioning the data itself in a binary encoding similar
| to BSON, it tracks changes and writes simple JSON diff files for
| each new revision.
|
| Furthermore, the data pages are not simply copied on write: a
| sliding snapshot algorithm ensures that, for the most part, only
| changed records have to be written. Before the page fragments are
| written to durable storage they are compressed, and in the future
| they might be encrypted.
|
| [1] https://sirix.io | https://github.com/sirixdb/sirix
| jmalicki wrote:
| Every Google hit for "UberPage" seems to be your writing. Is
| there standard ZFS terminology, something that would be found in
| ZFS documentation, for what you're referring to?
| lichtenberger wrote:
| In our case it's the tree root of the index, which is always
| written after all the descendant pages have been written as a
| new revision is committed (during a postorder traversal of
| the new pages).
|
| In ZFS the equivalent of the UberPage is called the uberblock. We
| borrowed some of the concepts, such as adding checksums in the
| parent pages instead of the pages themselves. In ZFS they are
| blocks :-)
|
| Thanks for asking.
| ithrow wrote:
| Yes, but sirix is not a SQL/RDBMS.
| lichtenberger wrote:
| True. In general I could add storing relational data as well,
| but currently I'm focusing entirely on JSON, auto-indexing
| for secondary indexes, and higher-order function support
| in Brackit.
|
| Of course we'd need more manpower, as I'm more or less the
| only one working on the core in my spare time (since 2012).
|
| Moshe mainly works on the clients and a frontend. A new
| frontend based on SolidJS is in the works showing the history
| and diffs as in
|
| https://raw.githubusercontent.com/sirixdb/sirix/master/Scree.
| ..
|
| However, we're of course looking forward to suggestions, bug
| reports, real world use cases and contributions :-)
| lichtenberger wrote:
| In order to provide fast audits of subtrees, the system optionally
| stores a Merkle hash tree (a hash in each node) and automatically
| updates the hashes of all ancestors during updates.
| chatmasta wrote:
| You might like what we're doing with Splitgraph. Our command
| line tool (sgr) installs an audit log into Postgres to track
| changes [0]. Then `sgr commit` can write these changes to
| delta-compressed objects [1], where each object is a columnar
| fragment of data, addressable by the LTHash of rows
| added/deleted by the fragment, and attached to metadata
| describing its index [2].
|
| I haven't explored sirix before, but at first glance it looks
| like we have some similar ideas -- thanks for sharing, I'm
| excited to learn more, especially about its application of ZFS.
|
| [0] https://www.splitgraph.com/docs/working-with-
| data/tracking-c...
|
| [1] https://www.splitgraph.com/docs/concepts/objects
|
| [2]
| https://github.com/splitgraph/splitgraph/blob/master/splitgr...
| lichtenberger wrote:
| Very interesting, thanks for pointing out :-)
| oliverrice wrote:
| author here (of the blog post and
| https://github.com/supabase/supa_audit)
|
| happy to answer any questions
| simonw wrote:
| Are you at all worried about the size of the audit table?
| Keeping a JSON copy of both the old and the new data is going
| to add up pretty quickly for fast-changing tables.
|
| I've built audit systems in the past which track just the new
| values, with the idea being that you can replay the audit log
| for a particular record from the beginning to reconstruct its
| state. I'm not convinced that's a better approach than yours
| here though, just interested in your thinking.
| oliverrice wrote:
| > Are you at all worried about the size of the audit table?
| Keeping a JSON copy of both the old and the new data is going
| to add up pretty quickly for fast-changing tables.
|
| That's def a valid concern if you're auditing a high-frequency
| insert/update table, or have your entire database under audit.
| If you do have those needs, a system that logs outside of
| postgres (like pgaudit) would be a better fit.
|
| In my experience most startups and mid-size enterprises
| sprinkle in auditing around the sensitive parts of the DB,
| like `account` or `access_control` tables, where write load
| isn't much of a concern.
|
| > I've built audit systems in the past which track just the
| new values
|
| Yeah, that solution works great! The one place it breaks down
| is if you apply auditing to an existing table and need to be
| able to recover the records as they were when auditing was
| enabled (the initial state is not snapshotted).
|
| We toyed with another approach to avoid having to track
| old_record: first checking to see if the `record_id` exists
| in the audit table, and then inserting a row with the OLD
| values with a `SNAPSHOT` `operation` if it does not.
|
| Even though that query/check was operating on an index, the
| performance overhead was higher than we were comfortable
| with.
|
| Storage is pretty cheap these days so we opted to optimize
| for write throughput + reduced memory usage rather than any
| considerations wrt disk.
| jsmith99 wrote:
| I've also used an update trigger to track only changed
| columns, inspired by this post[0], which wraps a comparison
| function into a new subtract operator for JSONB, similar to
| the built-in subtract operator for HSTORE which removes any
| key/value pairs that haven't changed.
|
| [0] http://8kb.co.uk/blog/2015/01/19/copying-pavel-stehules-
| simp...
| oliverrice wrote:
| thats a cool solution for reducing storage. nice!
| craigkerstiens wrote:
| Building a bit off Simon's question above: I'm curious whether
| any thought was given to using JSON over JSONB. The trade-off
| of being able to ingest and save records really fast seems like
| it could be better than JSONB, which has to do some conversion
| and, if there are any indexes on the JSONB columns, could
| really dramatically slow down throughput.
| oliverrice wrote:
| > I'm curious if any thought was given to using JSON over
| JSONB
|
| For sure! There is a conversion penalty that is paid
| incrementally (at write time). In exchange for that penalty,
| jsonb gets you reduced storage size and dramatically faster
| column value extraction with `->` or `->>` if you end up
| having to filter the audit table by something other than
| the `record_id`.
|
| It is a tradeoff, though, and depending on the specifics of
| your use case JSON could be a better choice.
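|
| For example, a filter like the one below (illustrative; assumes
| the audit table's `record` jsonb column and a `table_name`
| column) is where jsonb's faster `->>` extraction pays off:
|
| ```
| -- illustrative query: find audit entries for a given email,
| -- i.e. filtering by a field inside the audited row itself
| select *
| from audit.record_version
| where table_name = 'account'
|   and record ->> 'email' = 'someone@example.com';
| ```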
| netcraft wrote:
| this is great! I've done something similar many times, but the
| ability to do it on arbitrary tables is really nice.
|
| If you have the HSTORE extension, you can also use it to get a
| diff of the old and new: `HSTORE(OLD) - HSTORE(NEW)` and vice
| versa, very nice to easily have the deltas.
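|
| A minimal sketch of that idea (hypothetical `audit_log` table and
| trigger function name; assumes the hstore extension is installed):
|
| ```
| create extension if not exists hstore;
|
| -- store only the delta between OLD and NEW on updates
| create or replace function audit_hstore_delta() returns trigger as $$
| begin
|   insert into audit_log (table_name, changed, previous, ts)
|   values (
|     tg_table_name,
|     hstore(NEW) - hstore(OLD),  -- keys whose values changed, new values
|     hstore(OLD) - hstore(NEW),  -- the same keys, old values
|     now()
|   );
|   return NEW;
| end;
| $$ language plpgsql;
| ```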
| perlgeek wrote:
| First of all, this looks super awesome, and at least an order
| of magnitude better than our home-grown solution (grown over
| 25+ years on a DB that is now mariadb).
|
| How well does this work when you audit multiple tables with
| foreign key relationships between them?
|
| If the user references a customer, and you want to find out
| what customer the user belonged to at time X (but the customer
| might have been deleted in the mean time), you have to track
| through the history of multiple tables to find answers. More if
| you have n2m relationships...
|
| It sounds doable, but not trivial to get right for more
| complicated models.
|
| ---
|
| Update: to clarify, I want to be able to present a somewhat
| human-readable history of an object without doing much table-
| specific logic.
|
| If an author can publish many books, you typically have an n2m
| mapping table, like book_authors. If you want to add newly
| published books to the "author" history view, you have to
| determine that you have to include parts of the history of the
| book_authors table. I don't think you can do that through a
| join though? Trying to think my way through this...
|
| Maybe this can be dealt with more easily if you can assume each
| table has a non-composite primary key, include that primary
| key in the history table, and then do joins more easily
| on the history?
| oliverrice wrote:
| All the data would be available, and joining it would be
| possible with some jsonb gymnastics, but it's optimized more
| for compliance-style auditing than rapid lookups for complex
| queries.
|
| For that scenario, replicating audit data to an OLAP system may
| be a better fit.
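|
| A rough illustration of the kind of jsonb gymnastics that
| implies, reconstructing which customer a user pointed at as of a
| given time (hypothetical `users`/`customers` tables audited into
| `audit.record_version`; the `ts` column name and the psql-style
| `:at` parameter are assumptions):
|
| ```
| -- latest audited version of user 42 at or before :at, joined to the
| -- latest audited version of the customer it referenced at that time
| select u.record as user_record, c.record as customer_record
| from (
|   select record
|   from audit.record_version
|   where table_name = 'users'
|     and record ->> 'id' = '42'
|     and ts <= :at
|   order by ts desc
|   limit 1
| ) u
| cross join lateral (
|   select record
|   from audit.record_version
|   where table_name = 'customers'
|     and record ->> 'id' = u.record ->> 'customer_id'
|     and ts <= :at
|   order by ts desc
|   limit 1
| ) c;
| ```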
| simonw wrote:
| I'm not completely clear on the reason for both record_id and
| old_record_id - is this so that you can capture instances where
| one of the columns that makes up the primary key for the row is
| itself modified for some reason?
| oliverrice wrote:
| Being able to see when the primary key changes using
| `record_id` and `old_record_id` is a perk, but we also wanted
| to make sure a snapshot of the existing data was available
| when auditing is enabled on a table that already has data in
| it. See comment ^ for more color
|
| Long story short: it is avoidable if you're okay with paying
| the penalty of querying the audit table by `record_id` before
| every write. We decided to use more disk and keep the higher
| write throughput
| bob1029 wrote:
| We have a new business system we are trying to build that has a
| _lot_ of very granular auditing and versioning requirements
| throughout, and this is where I started looking.
|
| After a while of screwing with this kind of approach, I realized
| that the number of different types of objects involved was going
| to cause an explosion in the number of related auditing and
| versioning tables. Keeping track of not only who changed
| something but also the exact facts they changed (such that you
| could revert as needed) requires a lot of additional boilerplate
| in each instance.
|
| So, I made a pretty big gamble and went all-in on an event-
| sourced abstraction where everything is written to 1 gigantic
| append-only log of events. Turns out, this was worth it in our
| case. As long as everything has a timestamp and you have a way to
| refer to specific log entries (i.e. by offset), you basically
| have a complete solution. The hard parts are handled by a JSON
| serializer and gzip. We use in-memory indexes that are simply
| long/long dictionaries that map an object identity to a physical
| log offset.
|
| The only downside with what I have proposed is that there is no
| garbage collection or cleanup possible. At this time, this is
| actually a highly-desirable property of the system. Disk space is
| cheap. Losing 3 weeks of work to an accidental button click is
| not.
| infogulch wrote:
| I've done this before on smaller scales. By writing a set of
| table-valued functions that expose the event data in relational
| form "as of the date given by param X", and views on top of that
| where X is set to getdate(), you get a very ergonomic way to
| consume "current" and "historical" data. Materialized views can
| make querying these more natural relations very cheap if your
| engine supports them.
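|
| In Postgres terms (the comment above uses SQL Server's
| getdate()), a minimal sketch of that pattern might look like the
| following; the `account_events` table and its `happened_at` and
| `payload` columns are hypothetical:
|
| ```
| -- relational "as of" view over an append-only event log:
| -- latest event per account at or before the given timestamp
| create or replace function accounts_as_of(p_at timestamptz)
| returns table (account_id bigint, name text) as $$
|   select distinct on (e.account_id)
|          e.account_id,
|          e.payload ->> 'name'
|   from account_events e
|   where e.happened_at <= p_at
|   order by e.account_id, e.happened_at desc;
| $$ language sql stable;
|
| -- the "current" view is just the function evaluated at now()
| create view accounts_current as
|   select * from accounts_as_of(now());
| ```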
| pstuart wrote:
| > everything is written to 1 gigantic append-only log of events
|
| Do you replicate that stream to avoid SPOF?
| bob1029 wrote:
| I am currently investigating synchronous replication of each
| compressed log batch to a 2nd host. We are totally fine with
| availability issues (i.e. the system being down because log
| replication is broken). Availability problems are far preferable to
| consistency and other nasty bugs.
| lichtenberger wrote:
| As posted in another comment, that's basically what
| https://github.com/sirixdb/sirix supports, along with easily
| reconstructing former revisions of a JSON document (or any node
| therein), sophisticated secondary (also versioned) indexes,
| querying with JSONiq, and optimizations at query compile time
| for joins and aggregates.
|
| In our case we opted for three types of secondary user-defined
| indexes (besides the main document trie index to find unique,
| stable nodes by their 64-bit ID), namely:
|
| 1. field indexes
|
| 2. path indexes
|
| 3. content-and-structure indexes which index paths and their
| typed values
|
| Furthermore, a path summary keeps track of all distinct paths in
| the JSON resource.
| scwoodal wrote:
| I didn't see this mentioned in the article, but with JSONB you
| can leverage operators to know what data changed:
|
| ```
| -- record = {"id": 1, "name": "bar"}, old_record = {"id": 1, "name": "foo"}
| select record - old_record where id=3;
|
| name => bar
| ```
| netcraft wrote:
| I don't think this is built in; I think you must have an
| extension or custom operator providing that:
| https://dbfiddle.uk/?rdbms=postgres_14&fiddle=273a440d043a82...
| scwoodal wrote:
| Ah yes, I was thinking HSTORE.
|
| https://dbfiddle.uk/?rdbms=postgres_14&fiddle=2f406630f367b5.
| ..
| efxhoy wrote:
| This SO answer has a good jsonb diff function:
| https://stackoverflow.com/a/36043269
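|
| For reference, a simplified sketch of that kind of function (not
| the linked answer verbatim; it keeps only the keys of `new_row`
| whose values differ from `old_row`):
|
| ```
| create or replace function jsonb_diff(new_row jsonb, old_row jsonb)
| returns jsonb as $$
|   -- keys present in new_row whose values are absent from or
|   -- different in old_row
|   select coalesce(jsonb_object_agg(e.key, e.value), '{}'::jsonb)
|   from jsonb_each(new_row) as e(key, value)
|   where old_row -> e.key is distinct from e.value;
| $$ language sql immutable;
|
| -- select jsonb_diff('{"id": 1, "name": "bar"}', '{"id": 1, "name": "foo"}');
| -- => {"name": "bar"}
| ```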
| craigkerstiens wrote:
| Love the detail and approach here, though (and it is mentioned at
| the bottom of the article) this shouldn't be used at any very
| large scale... the estimate of about 1k transactions per
| second or less seems about right. For any larger scale you want
| to look at something like pg_audit, which can scale much much
| further and be used with other extensions to ensure you have a
| full audit log that can't be modified (when dealing with
| compliance).
| sususu wrote:
| I really like this approach; however, I have a use case where the
| application user ID that made the operation must be saved in the
| audit table, and unfortunately I cannot see how to do that with a
| pure SQL solution.
|
| Has anyone done something similar with SQL only?
| oliverrice wrote:
| if `user_id` is in the table you're auditing you could do
|
| ```
| alter table audit.record_version
|   add column user_id bigint
|   generated always as (
|     (coalesce(record, old_record) ->> 'user_id')::bigint
|   ) stored;
| ```
|
| or if you meant the SQL `current_user` you could extend the
| trigger to track it.
|
| but if the user_id is only available in application space (not
| in the db) it will not be possible using straight SQL
| timwis wrote:
| PostgREST has an interesting approach to authentication that
| might give you some ideas:
|
| https://postgrest.org/en/stable/auth.html
|
| Supabase is built on top of PostgREST, but I can't say for sure
| it uses the same means of authentication (I think it has its
| own Auth layer).
| oliverrice wrote:
| Supabase passes all the JWT info to PostgREST, so it is still
| available in transaction-local config.
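|
| A sketch of reading that from inside a trigger or function (the
| `request.jwt.claims` setting name is what recent PostgREST
| versions use; older versions expose per-claim settings such as
| `request.jwt.claim.sub` instead):
|
| ```
| -- pull the authenticated user's id (the JWT "sub" claim) out of the
| -- transaction-local config; NULL when no JWT was provided
| select nullif(current_setting('request.jwt.claims', true), '')::jsonb
|        ->> 'sub' as user_id;
| ```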
| netcraft wrote:
| we do that by setting a local variable before the query and
| then reading that in the triggers:
|
| ```
| SET LOCAL foo.who_id = 'some-uuid';
| UPDATE table SET ...
| ```
|
| ```
| -- function to make getting the setting easier
| DROP FUNCTION IF EXISTS get_who_id (text);
| CREATE OR REPLACE FUNCTION get_who_id (default_value text DEFAULT null::text)
| RETURNS text AS $get_who_id$
| DECLARE
|   who_id text;
| BEGIN
|   BEGIN
|     who_id := current_setting('foo.who_id');
|   EXCEPTION WHEN SQLSTATE '42704' THEN
|     RETURN default_value;
|   END;
|   IF (length(who_id) = 0) THEN
|     RETURN default_value;
|   END IF;
|   RETURN who_id;
| END
| $get_who_id$ LANGUAGE plpgsql VOLATILE;
| ```
|
| ```
| CREATE OR REPLACE FUNCTION some_table_audit ()
| RETURNS TRIGGER AS $some_table_audit$
| DECLARE
|   who_id text;
| BEGIN
|   who_id := get_who_id(null::text);
|   ...
| ```
|
| Identifiers changed, but hopefully this will give you the idea.
| tda wrote:
| I've experimented with a very similar solution, and it felt a
| bit dirty, but so far it seems to be working just fine. I have
| made an integration for auditing with sqlmodel which I intend
| to share, but it is kind of rough and I was a bit stuck
| trying to clean it up. The idea is that you add e.g. a
| HeroHistory model derived from HeroBase and a HistoryMixin
| that creates the triggers and relationships to have an audit
| log of the Hero table.
|
| If anyone is interested give me a shout out
| sususu wrote:
| That's really interesting and gave me some (hopefully) good
| ideas.
|
| Thank you very much!
| raimille1 wrote:
| This is a great technique; we had it set up very similarly at a
| previous startup and it worked wonders. Thanks for sharing a
| generic way of doing it @oliverrice !
| ithrow wrote:
| Debezium can be used for this in embedded mode if you are on the
| JVM.
| johndfsgdgdfg wrote:
| Can you please elaborate? Last time I checked it needed Kafka
| to work.