[HN Gopher] Foundations of Databases (1995)
___________________________________________________________________
 
Foundations of Databases (1995)
 
Author : harperlee
Score  : 167 points
Date   : 2021-04-14 16:22 UTC (6 hours ago)
 
web link (webdam.inria.fr)
w3m dump (webdam.inria.fr)
 
| ExcavateGrandMa wrote:
| Behavior DB, let's call it that way :D
 
| macando wrote:
| It's incredible how little attention is paid to data modeling and
| querying in the education of people entering the software
| engineering workforce.
| 
| Getting your database model right, on the logical and physical
| level, will make developing and deploying any data-driven app
| simpler and easier. Getting it wrong? No modern programming
| language or architectural pattern will save you from the worst
| kinds of bugs, workarounds and bad performance.
| 
| Do yourself a favor and read this or any other legendary DB book.
 
  | adamnemecek wrote:
  | > this or any other legendary DB book.
  | 
  | What are the other legendary DB books?
 
    | macando wrote:
    | https://www.amazon.com/Modeling-Essentials-Third-Graeme-
    | Sims...
    | 
    | My favorite. Not an easy read, it took me weeks to complete
    | it.
    | 
    | https://www.amazon.com/Date-Database-
    | Writings-2000-2006-Chri...
    | 
    | Relatively unknown but great book if you want to read the
    | thoughts of a DB heavyweight Christopher Date who
    | collaborated with Edgar Codd.
    | 
    | https://www.amazon.com/NoSQL-Distilled-Emerging-Polyglot-
    | Per...
    | 
    | The best book to start exploring the land outside the
    | relational world. An easy read, can be completed in two to
    | three days.
 
  | vishnugupta wrote:
  | Copy pasting my comment[1] from an earlier thread.
  | 
  | A couple of years ago I spent quite some time trying to
  | evaluate the tech stack (and general engineering culture) of
  | merger/acquisition targets of my employer. It was quite a fun
  | exercise, all said and done. I encountered all sorts; from a
  | small team start up who had their tech sorted out more or less
  | to a largish organisation who relied on IBM's ESB which exactly
  | one person in their team knew how it worked!!
  | 
  | I discovered this exact method during the third tech evaluation
  | exercise. When the team began explaining various modules top-
  | down and user-flows etc., I politely interrupted them and asked
  | for DB schema. It was just on a whim because I was bored of
  | typical one way session interrupted by me asking minor
  | questions. Once I had a hang of their schema rest of the
  | session was literally me telling them what their control and
  | user flows were and them validating it.
  | 
  | Since then it's become my magic wand to understand a new
  | company or team. Just go directly to the schema and work
  | backwards.
  | 
  | Conversely, I've begun paying more attention to data modelling.
  | Because once a data model is fixed it's very hard to change and
  | once enough data accumulates the inertia just increases and
  | instead if changing the data model (for the fear of data
  | migration etc.,) the tendency is to beat the use cases to fit
  | the data model. It's not your usual fail-fast-and-iterate
  | thing.
  | 
  | [1] https://news.ycombinator.com/item?id=24137997
 
    | macando wrote:
    | > I politely interrupted them and asked for DB schema.
    | 
    | In order of importance:
    | 
    | 1. DB schema
    | 
    | 2. List of dependancies
    | 
    | 3. List of 3rd party integrations
    | 
    | Depending on the domain, 2. and 3. can be switched.
 
    | macintux wrote:
    | Many years ago, my employer was transitioning ERP systems
    | from...something CA-owned whose name escapes me and I'd
    | probably have pretty awful memories dredged up if I actually
    | went looking for it.
    | 
    | Anyway, we were lucky enough to have a very talented intern
    | who was assigned the task of understanding the database
    | schema behind the CA product in order to migrate our data to
    | the new system.
    | 
    | It was...horrific. Absolute nightmare of spaghetti. I'd like
    | to say I've seen worse, but I've never seen anything even in
    | the same ballpark.
    | 
    | I think we finally gave up after weeks of digging into it and
    | just started over with a mostly clean slate.
 
  | msluyter wrote:
  | To wit: I made it through a master's in CS without a database
  | class.
  | 
  | This reminds me of the famous Rob Pike quote:
  | 
  | "Data dominates. If you've chosen the right data structures and
  | organized things well, the algorithms will almost always be
  | self-evident. Data structures, not algorithms, are central to
  | programming."
  | 
  | I've often found that if I'm coding something and the code
  | starts looking increasingly gnarly, that rethinking the data
  | structures / data model will clean up the code.
 
    | gautamdivgi wrote:
    | We had two books when I was doing my bachelors in CS many
    | moons ago (late 90's). One was by Silberschatz & Galvin and
    | the other by Ullman. The first one was used for relational
    | algebra, various normal forms and introducing table design,
    | keys, constraints,etc. The second was used to teach the
    | theory for internals on how a DB is implemented - B+ trees,
    | deadlock handling, etc, etc. That was a hard course and it
    | was mandatory.
    | 
    | I can see not electing to have databases as a course for
    | masters though. Masters is to allow you to specialize and be
    | more choice driven.
 
    | andrewl wrote:
    | On that topic, Fred Brooks, author of _The Mythical Man
    | Month,_ said  "Show me your flowchart and conceal your
    | tables, and I shall continue to be mystified. Show me your
    | tables, and I won't usually need your flowchart; it'll be
    | obvious."
 
    | slver wrote:
    | "Data structures, not algorithms, are central to programming.
    | 
    | Can we unify this to simply "data structures are essential to
    | algorithms". It's really weird to put these in opposition,
    | when algorithm with no data and data without algorithm make
    | no sense.
    | 
    | Data structures are encoded with use cases in mind, those use
    | cases at least at the very low level are their algorithms.
 
      | Twisol wrote:
      | I think programming is more than just algorithms. System
      | design and architecture have a real impact on the longevity
      | and maintainability of your system, and I'd argue that data
      | models (not _structures_ per se, the distinction being
      | logical vs. physical) are foundational to a good design. In
      | contrast, algorithms play a much more focused role.
      | 
      | Put differently:
      | 
      | > Data structures are encoded with use cases in mind, those
      | use cases at least at the very low level are their
      | algorithms.
      | 
      | As you've lampshaded, it's the use cases that are most
      | fundamental. The data model should be designed to serve
      | those use cases. The data structures and algorithms reflect
      | the physical reality of the logical data model.
 
    | macando wrote:
    | > "Data dominates. If you've chosen the right data structures
    | and organized things well, the algorithms will almost always
    | be self-evident. Data structures, not algorithms, are central
    | to programming."
    | 
    | I'm saving this quote.
    | 
    | > I've often found that if I'm coding something and the code
    | starts looking increasingly gnarly that rethinking the data
    | structures / data model will clean up the code.
    | 
    | I've faced this over and over again. Writing algorithms is
    | hard when the underlying data is not optimal. It simply
    | invites writing complicated code and workarounds.
 
    | AlphaSite wrote:
    | Although it says data structures, isn't it more important to
    | talk about the structure of the data than the choice of data
    | structure?
 
    | bshipp wrote:
    | every one of my data scraping projects is littered with files
    | titled "database.db.bak1", "database.db.bak2".... for this
    | exact reason. "Oh I scraped 1000 pages and realized I missed
    | an entire nested data structure....better blow away the dB
    | and try again.
 
  | achn wrote:
  | This is very true, but also a hugely unfortunate reality. It
  | amazes me that we have reached this far with "object" graph
  | oriented data models backed with relational DBs. They are very
  | seldom conducive to one another.
 
| rgbimbochamp wrote:
| Plugging in Andy Pavlo's Database lectures @ CMU which are
| completely free on Youtube. Great guy and great lectures.
 
  | alistairw wrote:
  | Agreed. I finished watching them at the start of this year and
  | it's significantly helped me in using and making choices about
  | databases. I've also since started implementing my own database
  | for learning off the back of them. Couldn't be more grateful
  | for those courses being public.
  | 
  | link if anyone is interested: https://youtu.be/oeYBdghaIjc
 
  | golergka wrote:
  | And hands down best music choice of all the CS lectures I've
  | ever seen.
 
  | tinmandespot wrote:
  | Seconded. I have been watching them over the past 2-weeks -
  | he's a brilliant guy and a great teacher.
 
| jenkstom wrote:
| We don't need "more theory". The vast majority of computer
| science theory goes unapplied in solving real-world problems.
| While I'm sure this is great for postgrad computer science
| students, it's not very useful for the every day programmers out
| there.
 
  | dudeman13 wrote:
  | >We don't need "more theory". The vast majority of computer
  | science theory goes unapplied in solving real-world problems
  | 
  | The sheer amount of rewriting that is being done every day
  | around the world because people didn't bother understanding
  | what they were doing (and how they should have thought more
  | before trying to make it work) disagrees with this.
 
  | ggambetta wrote:
  | ...and this is how we ended with NoSQL.
 
  | marcodiego wrote:
  | >We don't need "more theory".
  | 
  | >it's not very useful for the every day programmers out there
  | 
  | So, programmer's talent is not being used to their fullest. The
  | ones which can't understand basic theory can be easily
  | condemned to always be mere users of tools while never
  | improving them. They run a greater risk of becoming obsolete or
  | devalued.
 
| anonymousDan wrote:
| Anyone recommend a good book specifically on distributed
| databases (not more general distributed systems stuff like e.g.
| klepmann's DDIA)?
 
  | jkaptur wrote:
  | In case someone isn't aware, Designing Data-Intensive
  | Applications is a very good introduction to distributed
  | databases, even if it doesn't specialize in them.
 
    | anonymousDan wrote:
    | Yeah I don't disagree, I've just read it already :)
 
    | artificial wrote:
    | I'll second this. It's an excellent intro that's very
    | approachable.
 
  | convolvatron wrote:
  | I got alot of value out of                 Distributed
  | Databases: Principles and Systems       Stefano Ceri, Giuseppe
  | Pelagatti       McGraw-Hill, 1985
  | 
  | it adopts a very specific approach, and is obviously missing
  | some newer techniques, but it certainly leaves you with a
  | feeling of 'yeah, i can build this'
 
  | e12e wrote:
  | Not a book... But foundationdb documentation (and source) might
  | be worth looking at? https://apple.github.io/foundationdb/
 
  | pcthrowaway wrote:
  | Not a book, but the podcast series "Scaling Postgres" looks
  | really good.
 
  | throwaway823882 wrote:
  | Not a book, but I would add to the foreword: "Try not to use
  | them."
 
    | victor106 wrote:
    | Genuinely curious: What are some challenges with using
    | distributed databases?
 
      | biggestdummy wrote:
      | Tons of complexity behind this, but the simple problem is
      | source of truth. When you store data in multiple places, if
      | they become unsynchronized, which is the true answer???
      | Within a monolithic system, you don't have differing
      | clocks, network partitions that can be very weird, etc.
      | And, a related problem, which is race conditions. What if
      | you are reading one copy of the data while the other
      | replica is being updated. Most complexity, imo, arises from
      | this fundamental problem and the various techniques to deal
      | with it.
 
    | bshipp wrote:
    | This is a valid comment that I'd expand to say "...without
    | exhausting all the tools available."
    | 
    | For example, if you're facing lengthy search queries and are
    | looking at partitioning the database to speed them up, many
    | databases include helpful internal tools that should be
    | attempted first. Proper indexing is a textbook all on it's
    | own, and the appropriate design of full-text search
    | dictionaries and queries is another.
    | 
    | The one that shocked me a year or two ago was implementing
    | full-text search in a gigantic sqlite DB. I was preparing to
    | migrate it to postgresql or elasticsearch because my
    | traditional "ilike" queries, even indexed, were ridiculously
    | slow, and I was not aware FTS was an included library.
    | 
    | Although FTS pretty much doubled the size of the DB, a three
    | minute query dropped to a few seconds. Since read queries are
    | non-blocking in sqlite, this allowed me to continue using the
    | DB without needing a bigger, more complex solution.
 
  | decebalus1 wrote:
  | Database Internals: A Deep Dive into How Distributed Data
  | Systems Work by Alex Petrov
 
| ChrisArchitect wrote:
| (1995)
 
  | slver wrote:
  | Year programmers have ignored relational theory since:
 
| senthil_rajasek wrote:
| Previous hn post with comments
| 
| https://news.ycombinator.com/item?id=19726520
 
| slyrus wrote:
| Was lucky enough to take Serge's class when he was visiting UC
| Berkeley back in the 90's. And yet I'm still writing SQL :(
| Curious to know what the state in the art in solving the then-
| thornier theoretical problems (negation mostly) is.
 
| tingletech wrote:
| I think this was my textbook when I took a relational database
| class at UCSD Extension back in the 90s.
 
___________________________________________________________________
(page generated 2021-04-14 23:00 UTC)