Rails Conf 2018 Day 2 April 18, 2018

I visited Rails Conf and Pittsburgh this week, and I wanted to post my notes from the talks I went to. Here are the notes and takeaways from day two.

Keynote: The Future of Rails 6: Scalable by Default

Speaker: Eileen M. Uchitelle

Notes

Rails is scalable out of the box up to a point. Once a Rails app gets to GitHub size, though, there are some workarounds and optimizations most teams apply. These optimizations are going to be built into Rails 6.

Some Rails 6 scalable by default improvements:

When companies try to solve problems they often only look inward. Eileen encouraged developers to think about building general solutions and open sourcing them to help the community. A lot of people face scalability problems. This encouraged me to restart open sourcing a project that addresses one scaling issue.

Takeaways


So You’ve Got Yourself a Kafka: Event-Powered Rails Services

Speaker: Stella Cotton

Notes

Kafka is used to stream data for data pipelines and event-driven applications. Kafka guarantees at-least-once delivery within partitions. Think of a Kafka partition as a long log file with indexes and ordering guarantees; a Kafka cluster is made up of many partitions. Applications will have multiple partitions, but you should put related events on the same partition to preserve their ordering. Kafka improves speed and independence over using RPC for distributed event processing, but it does remove the explicit dependencies between services.
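To make the "related events on the same partition" advice concrete, here is a minimal in-memory sketch. It is not a Kafka client: `ToyTopic`, the CRC32 hash, and the order keys are all assumptions made for illustration. The idea is only that hashing a key sends every event for that key to the same append-only log, so their relative order is kept.

```ruby
require "zlib"

# Toy stand-in for a Kafka topic (hypothetical, not a real client):
# an array of append-only partition logs.
class ToyTopic
  attr_reader :partitions

  def initialize(partition_count)
    @partitions = Array.new(partition_count) { [] }
  end

  # Hash the key so the same key always maps to the same partition.
  def partition_for(key)
    Zlib.crc32(key.to_s) % @partitions.size
  end

  # Append the event to its partition and return [partition index, offset].
  def publish(key, event)
    index = partition_for(key)
    @partitions[index] << event
    [index, @partitions[index].size - 1]
  end
end

topic = ToyTopic.new(4)
topic.publish("order-42", { order: "order-42", type: "created" })
topic.publish("order-7",  { order: "order-7",  type: "created" })
topic.publish("order-42", { order: "order-42", type: "paid" })
topic.publish("order-42", { order: "order-42", type: "shipped" })

# Every event for order-42 sits in one partition, in publish order.
order_42_events = topic.partitions[topic.partition_for("order-42")]
                       .select { |event| event[:order] == "order-42" }
                       .map { |event| event[:type] }
```

Ordering is only kept per key here. Events for different orders may land on different partitions and have no guaranteed order relative to each other.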

Martin Fowler defines four types of event systems in his blog post: What do you mean by “Event-Driven”?

  1. Event created: Only send an event with an id. Downstream services will call the sender if they need more information.
  2. Event + Information: Event with id plus event and state-change information. No reliance on calling the sender; a downstream service has all it needs.
  3. Event Sourcing: All state changes are emitted as events and you can rebuild the state of the world by replaying all events.
  4. Command Query Responsibility Segregation (CQRS): Split the system into a read side and a write side. Good reference here: CQRS
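The four styles above can be sketched in plain Ruby. This is an illustration under assumptions, not code from the talk or from Fowler's post: the event names, the payload fields, and the `OrderCommands` / `OrderQueries` classes are all hypothetical.

```ruby
# 1. Event created: only an id; consumers call the sender for details.
notification = { type: "order_paid", order_id: 42 }

# 2. Event + information: the payload carries the changed state itself.
with_state = { type: "order_paid", order_id: 42, status: "paid", total_cents: 1999 }

# 3. Event sourcing: rebuild current state by replaying every event in order.
def replay(events)
  events.each_with_object({}) do |event, orders|
    order = (orders[event[:order_id]] ||= {})
    case event[:type]
    when "order_created"
      order.merge!(status: "pending", total_cents: event[:total_cents])
    when "order_paid"
      order[:status] = "paid"
    end
  end
end

# 4. CQRS: writes and reads go through separate objects.
class OrderCommands
  def initialize(log)
    @log = log
  end

  def create(order_id, total_cents)
    @log << { type: "order_created", order_id: order_id, total_cents: total_cents }
  end

  def pay(order_id)
    @log << { type: "order_paid", order_id: order_id }
  end
end

class OrderQueries
  def initialize(log)
    @log = log
  end

  # Reads never mutate the log; they query a projection of it.
  def status(order_id)
    replay(@log).dig(order_id, :status)
  end
end
```

A real read side would usually keep its projection up to date incrementally rather than replaying the full log on every query. The replay here just keeps the sketch short.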

Stella suggested using Avro for Kafka schemas. Scaling can be as easy as adding more consumers, but there need to be metrics on latency. Consumers that are slower than the incoming requests, or that were paused for a while, may have a hard time catching up.
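A rough back-of-the-envelope sketch shows why catching up is hard. It assumes constant produce and consume rates, and the method names and numbers are made up for illustration. A consumer only drains its backlog at the difference between the two rates.

```ruby
# Lag per partition: newest offset in the log minus the consumer's committed offset.
def consumer_lag(latest_offsets, committed_offsets)
  latest_offsets.to_h do |partition, latest|
    [partition, latest - committed_offsets.fetch(partition, 0)]
  end
end

# Seconds until the backlog is drained, or nil if the consumer never catches up.
# Rates are in events per second and assumed constant.
def catch_up_seconds(backlog:, produce_rate:, consume_rate:)
  return 0.0 if backlog.zero?

  net_rate = consume_rate - produce_rate
  return nil unless net_rate.positive?

  backlog.to_f / net_rate
end

# A consumer paused for 10 minutes while 500 events/s kept arriving
# has 300_000 events to work through.
backlog = 10 * 60 * 500
seconds = catch_up_seconds(backlog: backlog, produce_rate: 500, consume_rate: 600)
```

With these example numbers the consumer needs 3000 seconds (50 minutes) to recover from a 10-minute pause, even though it is 20% faster than the producer.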

Takeaways


Postgres 10, Performance, and You

Speaker: Gabe Enslein

Gabe is on the Heroku Postgres team and summarized some of the cool things coming to Postgres 10.

Notes

Takeaways


Five Sharding Data Models and Which is Right

Speaker: Craig Kerstiens

Notes

What is sharding? Sharding is separating a large database into smaller, faster databases. Putting tables on different nodes allows performance gains. Tips for sharding:

There are five general ways to colocate data:

  1. Geography: For when there are clear geographic boundaries. Examples: Uber, Instacart.
  2. Multitenant: Each customer has their own shard. This will not work well if one customer takes up a disproportionate amount of the database; a single customer holding >10% may mean sharding doesn’t help much.
  3. Entity ID: Shard on an id when joins aren’t needed. Best for aggregations.
  4. Graph Database: Shard on a few relation types and replicate duplicate data. Check out the paper on TAO, Facebook’s distributed data store for the social graph.
  5. Time Series: Event data and metrics can be sharded by a time period. Works best when dropping older data.
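Two of these models are easy to sketch as routing functions. This is a simplified illustration, not what Craig showed: the shard count, the `events_YYYY_MM` naming scheme, and the method names are assumptions.

```ruby
require "zlib"
require "date"

SHARD_COUNT = 4 # hypothetical number of shards

# Multitenant / entity id: hash the id so each tenant's rows live on one shard.
def shard_for_id(id, shard_count = SHARD_COUNT)
  Zlib.crc32(id.to_s) % shard_count
end

# Time series: route each event to a per-month partition by its timestamp.
def partition_for_time(time)
  format("events_%04d_%02d", time.year, time.month)
end

# Dropping older data: list the partitions that fall before the retention cutoff.
# Zero-padded names sort chronologically, so a string comparison is enough.
def expired_partitions(partition_names, today:, keep_months:)
  cutoff_name = partition_for_time(today << keep_months) # Date#<< goes back N months
  partition_names.select { |name| name < cutoff_name }
end

names = %w[events_2018_01 events_2018_02 events_2018_03 events_2018_04]
to_drop = expired_partitions(names, today: Date.new(2018, 4, 18), keep_months: 2)
```

With time-based partitions, removing old data means dropping whole partitions instead of deleting rows one by one, which is why this model works best when older data is discarded.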

Takeaways


Ales on Rails: Making a Smarter Brewery with Ruby

Speaker: Ben Shippee

This was more of a fun talk without many takeaways, but it was fun to see someone building a cool home-built system for managing a brewery.

Notes

The Rails application was custom-built for Brew Gentlemen. The goal was to make managing the brewery easier. Features that they built to automate the brewery:

Takeaways


Containerizing Rails: Techniques, Pitfalls, & Best Practices

Speaker: Daniel Azuma

Blog post from the speaker here: Containerizing Rails: Techniques, Pitfalls, and Best Practices (RailsConf 2018)

Notes

Tips for containerizing your application:

  1. Read and understand the base image.
  2. Combine update, install and clean commands in one RUN line to prevent image bloat.
  3. Use multi-stage Dockerfiles to have an image for building dependencies, then copy the built app into the final image without development dependencies.
  4. Set locale in the Dockerfile to potentially avoid some weird Ruby string errors.
  5. Still run your app under an unprivileged user!
  6. Prefer the exec form: CMD ["bundle", "exec", "rails", "s"]. This ensures that the stop signals are sent to the program and not the shell.
  7. If you need the shell form, you can get around 6 by prefacing the command with exec.
  8. Avoid using ONBUILD because it makes assumptions about how the image is used.
  9. Always specify resource constraints to help Kubernetes plan workloads.
  10. Avoid preforking in a container, instead have one process per container.
  11. Scale by adding containers.
  12. Send logs outside the container, via either an agent or standard out.
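Tips 6 and 12 are about the container, but they only pay off if the process inside handles the stop signal and writes its logs somewhere the container runtime can collect them. Below is a minimal Ruby sketch of that side. The `Worker` class is hypothetical; in a real Rails container the app server plays this role.

```ruby
require "logger"

# Hypothetical long-running process that stops cleanly when asked to.
class Worker
  attr_reader :iterations

  def initialize(logger:)
    @logger = logger
    @stop_requested = false
    @iterations = 0
  end

  # Only flips a flag, so it is safe to call from a signal handler.
  def request_stop
    @stop_requested = true
  end

  def run
    until @stop_requested
      @iterations += 1
      yield self if block_given? # one unit of work
      sleep 0.01
    end
    @logger.info("stop requested, shutting down cleanly")
  end
end

# Usage, left as a comment so loading this file does not start a loop:
#   $stdout.sync = true                              # do not buffer container logs
#   worker = Worker.new(logger: Logger.new($stdout)) # log to standard out
#   Signal.trap("TERM") { worker.request_stop }      # docker stop sends SIGTERM
#   worker.run { do_work }                           # do_work is a placeholder
```

The trap handler does nothing except set a flag, and the log line is written from the main loop after it exits. If the container used the shell form of CMD without exec, the shell would receive the SIGTERM and this handler might never run.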

Takeaways