Building Databases

=== ORM Standards / Best Practices ===

Hibernate / JPA are the “global standard”, but Japanese engineers usually use MyBatis or some other framework built around raw SQL files.

The other team members do not see any issue with not using Hibernate / JPA or other ORM technologies. As long as DAOs are being used and identity and singularity are preserved, there is no need to move to something like Hibernate just for the sake of adhering to some perceived “global standard”.
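A minimal sketch of what “DAO plus preserved identity” could look like without an ORM. It assumes that “identity and singularity” means one in-memory object per database row; the `Customer` type, the `CustomerDao` interface, and the map standing in for the real table are all hypothetical and not from the discussion.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical domain type, used only for illustration.
final class Customer {
    final long id;
    String name;
    Customer(long id, String name) { this.id = id; this.name = name; }
}

// The DAO hides how rows are loaded (MyBatis, raw SQL files, and so on).
interface CustomerDao {
    Optional<Customer> findById(long id);
}

final class IdentityMappedCustomerDao implements CustomerDao {
    // Assumption: a plain map stands in for the real table (id -> name).
    private final Map<Long, String> rows;
    // Identity map: at most one Customer instance per primary key.
    private final Map<Long, Customer> loaded = new HashMap<>();

    IdentityMappedCustomerDao(Map<Long, String> rows) { this.rows = rows; }

    @Override
    public Optional<Customer> findById(long id) {
        Customer cached = loaded.get(id);
        if (cached != null) return Optional.of(cached); // reuse the existing instance
        String name = rows.get(id);
        if (name == null) return Optional.empty();      // no such row
        Customer created = new Customer(id, name);
        loaded.put(id, created);
        return Optional.of(created);
    }
}
```

Calling `findById(1L)` twice on the same DAO returns the same `Customer` instance, so a change made through one reference is visible through the other.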

An example story was given of a company with all of its business logic embedded in its database. They had a 1-hour SLA, and they took significant losses when they were unable to scale the application to meet increased demand.

ORM overreach is a serious problem.

TopLink was *way* better than Kodo.

https://en.wikipedia.org/wiki/TopLink

With “true orthogonal persistence”, there are no concerns about identity or uniqueness even when the data is distributed across multiple VMs.

Hibernate tends to cause more trouble than it is worth. Examples:

— Hibernate’s default configuration basically prevents good performance

— Hibernate can force you to modify your data model to fit its needs.

One of the participants wrote a wrapper for the Spring Framework.

If your data is tabular data, treat it like tabular data!

“What DB access framework do you guys use?”
— raw JDBC
— Spring (just makes JDBC more ergonomic)
— jOOQ (advantage is type checking)

=== Fauna, a new database written in Scala ===

Why another database?
There was a niche to fill for an “adaptive operational database”.

DEF: operational: able to serve data that is immediately needed by day-to-day business (e.g. who is online, price info for PoS). This can be modeled as a bunch of stateless services.

DEF: adaptive: scalability / flexibility in provisioning / really good ergonomics (not another Cassandra)

But analysis (BI?) is also a requirement!

This new DB is completely “temporal”.

DEF: “temporal”: like Git or Mercurial (hg), you always have a full audit trail and you can run arbitrary queries against the DB as it was at any specific time in the past. (You *can* throw away data if you want.)
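A toy in-memory sketch of the “temporal” idea: every write is kept as a new version, and reads can ask for the value as of a given time. This is not Fauna's API or storage design; the class and method names are hypothetical, and timestamps are plain `long` values supplied by the caller.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Toy model only: names and structure are assumptions for illustration.
final class TemporalStore<K, V> {
    // For each key, every version ever written, ordered by timestamp.
    private final Map<K, TreeMap<Long, V>> history = new HashMap<>();

    // Append a new version; older versions are kept, not overwritten.
    void put(K key, long timestamp, V value) {
        history.computeIfAbsent(key, k -> new TreeMap<>()).put(timestamp, value);
    }

    // Read the value as it was at the given time:
    // the latest version written at or before that timestamp.
    Optional<V> getAsOf(K key, long timestamp) {
        TreeMap<Long, V> versions = history.get(key);
        if (versions == null) return Optional.empty();
        Map.Entry<Long, V> entry = versions.floorEntry(timestamp);
        return entry == null ? Optional.empty() : Optional.of(entry.getValue());
    }

    // Optional pruning: drop versions that were already superseded at the
    // cutoff, keeping the one still in effect so later reads stay correct.
    void discardBefore(K key, long cutoff) {
        TreeMap<Long, V> versions = history.get(key);
        if (versions == null) return;
        Long inEffect = versions.floorKey(cutoff);
        if (inEffect != null) versions.headMap(inEffect, false).clear();
    }
}
```

With versions written at times 10 and 20, a read as of time 15 sees the first version and a read as of time 20 or later sees the second. After `discardBefore(key, 25)`, reads before time 20 return nothing, because that history has been thrown away.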

You can stream events from the DB if needed.

Other features:

— always replicated (minimum setup is 3 replicas w/ 3 partitions)

— all JSON API

— distributed consistent database

The temporal nature makes concurrent updates easy, but they also need to implement CAS (compare-and-swap).

http://cs.yale.edu/homes/thomson/publications/calvin-sigmod12.pdf
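A minimal single-process sketch of the compare-and-swap idea mentioned above: a write succeeds only if the version the writer read is still the current one. It says nothing about how the database does this across replicas; the `VersionedCell` class and its methods are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy single-process model; names are assumptions, not a real database API.
final class VersionedCell<V> {
    // Immutable snapshot: a value plus the version number it was written at.
    static final class Versioned<V> {
        final long version;
        final V value;
        Versioned(long version, V value) { this.version = version; this.value = value; }
    }

    private final AtomicReference<Versioned<V>> current;

    VersionedCell(V initial) {
        current = new AtomicReference<>(new Versioned<>(0L, initial));
    }

    Versioned<V> read() { return current.get(); }

    // Write only if nobody else has written since expectedVersion was read.
    // Returns false on a stale version so the caller can re-read and retry.
    boolean compareAndSet(long expectedVersion, V newValue) {
        Versioned<V> seen = current.get();
        if (seen.version != expectedVersion) return false;
        return current.compareAndSet(seen, new Versioned<>(expectedVersion + 1, newValue));
    }
}
```

Two writers that both read version 0 cannot both succeed: the first `compareAndSet(0, ...)` bumps the version to 1, and the second one fails and has to re-read.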

“I code in Scala but I am not a Scala developer”

DEF: Scala developer: rabid Scala fan-boy/girl

So why Scala?
Answer: one of the founders (Matt) is a “Scala developer”.

“We do have a query monad.” (because of *course* they do…)

Using Scala macros to write codecs.

CBOR (Concise Binary Object Representation: a binary encoding based on the JSON data model)
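To make the CBOR note concrete, here is a hand-rolled sketch that encodes two kinds of CBOR data items, unsigned integers and text strings, following the head-byte layout in RFC 8949. It is in Java rather than Scala, it is not how the codecs discussed in the talk were written, and the `TinyCbor` class name is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hand-rolled sketch covering only unsigned integers and text strings.
final class TinyCbor {
    // Writes the initial byte (major type in the top 3 bits) followed by
    // the argument n in the shortest form CBOR allows.
    private static void writeHead(ByteArrayOutputStream out, int majorType, long n) {
        int mt = majorType << 5;
        if (n < 24) {
            out.write(mt | (int) n);            // argument fits in the initial byte
        } else if (n <= 0xFFL) {
            out.write(mt | 24);                 // 1-byte argument follows
            out.write((int) n);
        } else if (n <= 0xFFFFL) {
            out.write(mt | 25);                 // 2-byte big-endian argument follows
            out.write((int) (n >> 8));
            out.write((int) n);
        } else if (n <= 0xFFFFFFFFL) {
            out.write(mt | 26);                 // 4-byte big-endian argument follows
            for (int shift = 24; shift >= 0; shift -= 8) out.write((int) (n >> shift));
        } else {
            out.write(mt | 27);                 // 8-byte big-endian argument follows
            for (int shift = 56; shift >= 0; shift -= 8) out.write((int) (n >> shift));
        }
    }

    // Major type 0: unsigned integer (this sketch accepts 0 .. Long.MAX_VALUE).
    static byte[] encodeUnsigned(long n) {
        if (n < 0) throw new IllegalArgumentException("n must be non-negative");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeHead(out, 0, n);
        return out.toByteArray();
    }

    // Major type 3: UTF-8 text string, prefixed by its length in bytes.
    static byte[] encodeText(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeHead(out, 3, utf8.length);
        out.write(utf8, 0, utf8.length);
        return out.toByteArray();
    }
}
```

The integer 10 becomes the single byte `0x0a`, and 1000 becomes the three bytes `0x19 0x03 0xe8`, compared with two and four characters as JSON text.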

Raft is like an understandable Paxos.

Recruiting: More than a particular language paradigm, they want people who really know the JVM.

Serviceability issues: Heavy use of continuation-passing style in Scala results in a heap full of pending futures. Thread dumps are not very helpful.
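A small sketch of why thread dumps say so little in this style. It uses Java's `CompletableFuture` rather than Scala futures, and the `PendingFutures` class is hypothetical. Each step only registers a callback and returns immediately, so while the source is incomplete no thread is blocked on it and the pending work exists only as objects on the heap.

```java
import java.util.concurrent.CompletableFuture;

// Illustration only: a chain of continuations hanging off an incomplete future.
final class PendingFutures {
    // Each thenApply registers a callback and returns a new future at once.
    // No thread waits here, so nothing about this pipeline shows up in a
    // thread dump until some thread actually completes the source.
    static CompletableFuture<Integer> pipeline(CompletableFuture<String> source) {
        return source
            .thenApply(String::trim)
            .thenApply(Integer::parseInt)
            .thenApply(n -> n * 2);
    }
}
```

Until something calls `complete` on the source, the returned future stays not done, and the three registered steps are visible only in a heap dump. Once the source is completed, the callbacks run on the completing thread.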