3620 stories
·
3 followers

The Internet

1 Share

The Internet

And more from the Internet.

Read the whole story
emrox
2 days ago
reply
Hamburg, Germany
Share this story
Delete

Database Design for Google Calendar: a tutorial

1 Share

Author: Alexey Makhotkin squadette@gmail.com.

Introduction

In this database design tutorial (~9000 words) I’m going to show how to design the database tables for a real-world project of substantial complexity.

We’ll design a clone of Google Calendar.  We will model as much as possible of the functionality that is directly related to the calendar.

This series illustrates an approach explained in the book called “Database Design using Minimal Modeling”, scheduled to be released in Summer 2024.  Here is the website of the book: https://databasedesignbook.com/.  You can leave your email address to receive occasional updates on the book and related materials.

We will first build a complete logical model that describes the calendar data to be stored.  This should take most of the effort (~80% of text by word count). After the logical model is finished, we’ll build table design directly based on the logical model.

Intended audience

The goal of the book is to help you get from a vague idea of what you need to implement (e.g.: “I need to build a website to manage schedule and instructor appointments for our gym”), to the full and complete definition of database tables.

The first three quarters of the text require only a general understanding of what databases are and how information is stored there.  Parts 1-6 only talk about logical models, that do not depend on the specific database that you use (MySQL, Postgres, other classic relational servers, NoSQL solutions, cloud databases, etc.).  So the majority of the text describes how the business requirements are modeled.

The last quarter of the text shows how to get from the logical model to physical tables structure.  This part is absolutely not comprehensive, it only shows one of many possible approaches to the database table design.  This approach, however, is perfectly valid for not very demanding systems.  Also, many existing systems are designed in part using this strategy.   This part of the text requires more familiarity with common databases: how the tables are created, what physical data types exist, what is primary key and index, how the tables would be queried, and how to insert and update data.

Table of Contents

Introduction

Approach of this book

Problem description

Part 1: Basic all-day events

Part 2: Time-based events

Part 3. Repeated all-day events

Part 4. Rendering the calendar page

Part 5. Rendering the calendar page: time-based events.

Part 6. Complete logical model so far

Part 7. Creating SQL tables

Finally: the tables

Conclusion

What’s next?

Approach of this book

Often people start with designing the tables right away, but we take a different approach.  This tutorial is aimed at people who are new to database design.  The goal of the process is to answer several important questions:

  • Where to begin? 

  • How to make sure that we did not miss anything?

  • how to ask for feedback on our database design;

  • how to fix design mistakes;

We begin with a logical model, written in a simple tabular format.  We use short formalized sentences to define data attributes and relationships between entities.  This helps us to make sure that our logical model aligns with the actual business requirements.   Logical model is independent from a specific database implementation.

As the second step, when the logical model is decided, we design the physical tables.  This process is very straightforward.  For each element of the logical model there would be a corresponding table or column.  Physical models can be as dependent on a specific database implementation as you need.

Problem description

We’re going to implement a big part of Google Calendar functionality.  Some parts we’ll skip, but we’ll try and implement every feature of calendaring.  Some areas we’ll implement just enough to be able to discuss the more interesting parts.  In the end you will be able to add missing functionality to the schema, going through the same process.

Google Calendar is a multi-user system.  For example, users can share the events with other people.  We’re going to implement only a bare minimum of user-related data.

Events are the central part of Google Calendar, and we’re going to design them as closely as possible to the real thing.  Events have title and description, as well as some other minor attributes such as location.

The most complex part of calendar events is times and dates:

  • “All day” events vs time-based;

  • Both can be repeated and non-repeated;

  • All day events:

    • Can spread over multiple days;
  • Time-based events:

    • Can have associated time zone;

    • Have begin and end time;

    • Begin and end time can happen on different days;

    • Begin and end time can be in different timezones;

  • Both all-day and time based events:

    • Can be repeated daily, or every N days;

    • Can be repeated weekly, on certain days of the week; again, it can be every two or more weeks;

    • Can be repeated monthly, on a certain day or day of the week;

    • Can be repeated annually;

    • Repeating events can go on forever, until a certain date, or for a certain number of repetitions;

For the repeating events, the specific instances can be moved to a different date/time.  You can also delete some instances of the repeating events, for example, skipping a certain weekly meeting.

You can change the schedule of the repeated event, even if some of the events already happeneds.  For example, you can switch from two project meetings every week on Tuesday and Thursday to one meeting every two weeks, on Fridays.

Here is a screenshot of a day event editing form:

Part 1: Basic all-day events

Anchors

First thing we need is to find some so-called anchors.  Anchors are also known as entities.  Anchors are usually nouns, such as User and Event.

Anchors are something that can be counted.  “No users”, “one user”, “two users”, etc.  Also, one defining characteristic of an anchor is that it can be added: “adds a user record to the database”.

Anchors are extremely obvious in simple cases, but may get tricky in non-obvious cases.  We’re going to write down even the most obvious anchors, to get some experience with handling them.

The first two anchors that come to mind are:

Anchor Physical table
User
DayEvent

Basically the only thing that anchors handle is IDs and counting.  All the data is handled by attributes, discussed in the next section.

So, for example, in our database tables there would be a User with, say, ID=23, and DayEvent with ID=100, etc.

We won’t be dealing with the last column (“Physical table”) for now, we’ll discuss the physical model in the “Creating SQL Tables” section below.

To validate that we have an anchor, we can write a sentence using the name of this anchor.  If this sentence makes sense, then this is an anchor.  Example sentences:

  • There are 200 Users in our database”;

  • When this form is submitted, a new User is added to the database”;

Same for DayEvent:

  • There are 3000 DayEvents in our database”;

  • When this button is clicked, a new DayEvent is created”;

Such sentences will be useful later in more complicated cases.

If the sentence does not make sense, then it may be an attribute:

  • There are 400 Prices in our database” (???);

  • When this form is submitted, a new Price is added to the database” (???).

Attributes of User

Attributes store the actual information about anchors.

Which data about users should we model?  Users are ubiquitous, and different systems may want to store a lot of information about users.  For this post, we’re going to model just the bare minimum of user data: emails.

Anchor Question Logical type Example value Physical column Physical type
User What is the email of this User? string “cjdate@example.org”

What can we see here?

  • This attribute belongs to the User anchor that was defined in the previous section;

  • We use questions to describe all sorts of attributes.  Later on we’ll discuss why we prefer this style over “User’s email” and such;

  • Logical type is quite simple.  If you expected to see “VARCHAR(128)” here or something like that: no, we’ll discuss this much later.

  • We show an example value that helps us to confirm our thinking.  Again, in simple cases this is very obvious, but it would help reviewers to confirm that everyone is on the same page.

  • We won’t be dealing with two last columns for now, we’ll discuss the physical model later in this post.

We’re going to see more examples of logical types later. We extensively discuss the logical types in the book.

Let’s skip ahead a little bit, and show how the table data about the users looks like.  We’ll be using a simple strategy to design physical tables, so the result is going to be completely unsurprising:

Table users
id email
2 “cjdate@example.org”
3 “someone@else.com”

This is just to show the part of the final result so that you know where we’re heading to.  Here, a user with ID=2 has email “<a href="mailto:cjdate@example.org">cjdate@example.org</a>”, and a user with ID=3 has email “<a href="mailto:someone@else.com">someone@else.com</a>”.

Except for this, we won’t be talking much about the users here.

Attributes of DayEvent

Suppose that we want to schedule a two-day company retreat that begins on January 14th, 2024.  In terms of anchors, this is going to be a DayEvent.

Looking at the paragraph above, we can see that we need to store the following data about the DayEvents:

  • Name of the event;

  • Begin date and end date of the event.

Let’s write that down in our table:

Anchor Question Logical type Example value Physical column Physical type
DayEvent What is the name of this DayEvent? string “Company retreat”
DayEvent When does the DayEvent begin? date 2024-01-14
DayEvent When does the DayEvent end? date 2024-01-15

What can we see here?

  • We defined our first three attributes;

  • We don’t have any short names for the attributes, and this may bother us a little.  We’d expect to have something like “DayEvent_name” or some other identification to refer to the attribute later in the text.  We’ll return to this topic later.

  • We have a new logical type: date. We won’t need to deal with timezones in this section.

  • For most events in actual calendars the begin date and end date would probably be the same (most events are single-day).  We’ll just store the same date in both attributes.  This allows us to treat the special case (single-day events) as the general case (multi-day events).  This is a general design strategy, but we’re going to investigate later if this line of thinking is always applicable.

Links

Where do we store the information that this particular user has created this particular DayEvent?  At the first glance, this may look like an attribute of DayEvent, right?  (Actually, no).

Anchor Question Logical type Example value Physical column Physical type
DayEvent Which User has created this DayEvent??? number??? 2???

Attributes cannot contain IDs.  Instead, when two anchors are involved, we need to use links.

Anchor1
*
Anchor2
Sentences (subject, verb, object, cardinality) Cardinality (1:N, M:N, 1:1) Physical table or column
User < DayEvent User creates several DayEvents DayEvent is created by only one User 1:N

Links help to pin down the correct design for many complicated scenarios.  To describe the link, you need to write down two sentences.  If the sentences don’t make sense or do not match the reality of the business — you’ve just prevented a design mistake!

What do we see in this table?

  • Link connects two anchors.  Anchors could be different, like here, or the same. For example, “Employee is a manager of Employee” is one example of one such link.

  • To specify cardinality, we use a custom notation with the “<” character (other possibilities are: “”, “=”, and “>”).  It is the same as “1:N”, but additionally it allows you to specify which anchor is “only one” and which is “several”.

  • We use two formalized sentences that involve two anchors, a verb, and information about cardinality.  Those sentences allow us to validate and document our design.

  • We write down the cardinality again, in a more familiar way in a separate column.  Pinning down cardinality is essential, so we make a lot of fuss about it.

(We’ll discuss the links much more, of course).

A peek into the physical model

Again skipping ahead: if we would have stopped right now, and tried to write down the physical design for the schema that we have so far, here is what we’d see.  This is just to confirm that we’re heading in a familiar direction.

Table: day_events
id name begin_date end_date user_id
20 “Company retreat” 2024-01-14 2024-01-15 3

We’ll discuss the physical model later in the “Creating SQL Tables” section.

Part 2: Time-based events

In the previous section, we discussed basic non-repeating date-based events.  Let’s see how our modeling approach handles time-based events.

We modeled date-based events as the DayEvent anchor with the following attributes:

  • What is the name of this DayEvent?;

  • When does the DayEvent begin?

  • When does the DayEvent end?

Also, we defined the link between User and DayEvent: “User creates several DayEvents”.

Let’s write down a quick draft of time-based events and see how it compares to date-based events.  Quoting the “problem description” section:

  • “Time-based events:

    • Can have associated time zone;

    • Have begin and end time;

    • Begin and end time can happen on different days;

    • Begin and end time can be in different timezones;”

Time zones

Every country and territory uses one or more time zones. Time zone definitions occasionally change.  Each country, being a sovereign state, can decide to change their time zone definition.

Time zones may use Daylight Savings Time, or can be uniform.  New time zones may be introduced, or retired.  In this text, we won’t go into complications of handling time zone definitions.  If you were really implementing a serious global calendaring solution, you’d probably have a separate team dealing with such issues.

However, in this tutorial we will implement full timezone-aware events that are usable in practice.

We have one motivating example: plane tickets.  Planes often cross time zone boundaries, and the take-off and landing times in your ticket would be in different time zones. Say, there is a flight from Amsterdam to London that departs on December 24, at 16:50 (Amsterdam time) and lands at 17:05 (London time).  So, the flight duration is 1 hour 15 minutes.

Time zones inspire a lot of programming folklore.  There are many blog posts, horror stories, “things every programmer should know” and other texts related to time zones, particularly in database context.  Also, many systems keep breaking in various ways around the daylight switch time.  We will discuss only as much as needed for our purposes, and briefly mention some important things to consider.

Anchors

Having said that, it seems that we need to add two anchors:

Anchor Physical table
Timezone
TimeEvent

There are dozens of time zones in the world. We can confirm the validity of the Timezone anchor by writing down example sentences:

  • There are 120 Timezones in our database”;

  • When this import script finishes, a new Timezone is added to our database”.

(Timezone data structure is discussed below.)

The sentences for TimeEvent are also straightforward:

  • There are 2500 TimeEvents in our database”;

  • When this button is clicked, a new TimeEvent is created”;

Attributes of Timezone

For the purposes of this text, we’ll do only a very minimal model of Timezone.  Basically, the only attribute we’d introduce is:

Anchor Question Logical type Example value Physical column Physical type
Timezone What is the human-readable name of this Timezone? string “Europe/Kyiv”

We won’t go into details of how the timezone is actually defined.  We assume that there is a complementary logical model that describes the structure of timezones.  Also, we assume that there is some function that takes a local time in the specified timezone and returns UTC time (and vice-versa).  This will be discussed in more detail in the next section, when we will talk about repeating events.

For clarity, here is what else would be included into a time zone definition:

  • What is the UTC offset of this time zone?

  • Does this time zone have Daylight Savings Time?

  • When does DST begin?  When does DST end?

  • What is the UTC offset when DST is on?

  • We’d also need to model previous definitions of a time zone.  For example, the government may decide to change the day when DST goes into force, or get rid of DST, etc.

  • Is this time zone active or retired?

This is an incomplete list.  Modeling all of this data using our approach is possible, but is a separate, and quite technical exercise.

Let’s get back to events.

Attributes of TimeEvent

Anchor Question Logical type Example value Physical column Physical type
TimeEvent What is the name of this TimeEvent? string “Catch-up meeting”
TimeEvent When does the TimeEvent begin? date/time (local) 2024-01-14 12:30
TimeEvent When does the TimeEvent end? date/time (local) 2024-01-14 13:15

Note that we’re using local time here.  You may have read that time should be stored in UTC time (without any time zones), and then formatted for human readability using a preferred time zone.

Here we have a different situation.  Time zones can change.  Suppose that we scheduled a billiard game on September 6, 2058, from 09:30 to 11:00, Cologne time.  At the moment we don’t know what UTC offset is going to be in that time zone at that time.  So we must store the data exactly as entered by the user, and then adjust it as the local legislation changes.

Links

We have two very similar links here.

Anchor1* Anchor2 Sentences (subject, verb, object, cardinality) Cardinality (1:N, M:N, 1:1) Physical table or column
Timezone < TimeEvent Timezone is used for the start time of many TimeEvents TimeEvent uses only one Timezone for the start time 1:N
Timezone < TimeEvent Timezone is used for the end time of many TimeEvents TimeEvent uses only one Timezone for the end time 1:N
User < TimeEvent User creates several TimeEvents TimeEvent is created by only one User 1:N

The definitions of both links differ only in a single word (“start” vs “end”).

Most of the time-based events would have the same time zone for both start and end times.  We design this as a general case: we always specify both, even if they are the same. This approach would help us to get used to handling more complicated cases.

Similarities between DateEvent and TimeEvent

All-day events and time-based events look similar. Does it make sense to think about unification?

For example, both events have names.  Also, how different are “date” and “date with time” from each other? We can also observe that both types of events would have more common data, such as “location”, list of invited guests, “description”, etc.  Maybe we could extract that to some component shared between those two anchors?

Thankfully, logical modeling allows us to wait a little bit before making that decision.  After all, it’s just simple tables that we can reshuffle before we commit to a physical table implementation.

For now, though, we want to gather more information on similarities (and, more important, dissimilarities) of two types of events that we’ve seen so far.  Also, we want to see if there would be more types of events, and what attributes and links they have.

Part 3. Repeated all-day events

As we may remember from the initial problem description:

“Both all-day and time based events:

  • Can be repeated daily, or every N days;

  • Can be repeated weekly, on certain days of the week; again, it can be every two or more weeks;

  • Can be repeated monthly, on a certain day or day of the week;

  • Can be repeated annually;

  • Repeating events can go on either forever, or until a certain date, or for a certain number of repetitions;”

In this section we’ll only talk about all-day events. Later we’ll see how the Minimal Modeling method handles commonalities between different anchors, in this case time-based events.  Also, we’ll see how the logical schema gets changed: we would use this as an example of how the draft design is edited when a better design approach is introduced.  Remember that all of this happens before we even start thinking about database tables, so this is a very smooth process: you don’t need to worry about table migrations yet.

If you think about the bullet points listed above, your reaction may be: “we probably need JSON for that”.  That may well be true, but JSON belongs to the physical table design, so we won’t discuss this at this point.  We’ll design everything that is needed on a logical level, and later on we’ll see what physical options we have.

Attribute #1, cadence

So, let’s ask the first question, hoping that it would help us find an attribute: “How often is that event repeated?”  Looking at event editing form we can see possible answers to that question: a) never; b) daily; c) weekly; d) monthly; e) annually.

We say that such attributes have an “either/or/or” type.  Let’s write it down as an attribute.

Anchor Question Logical type Example value Physical column Physical type
DayEvent How often is that DayEvent repeated? either/or/or daily weekly monthly annually

For either/or/or attributes, we show the entire list of possible values in the “Example value” column.

What do we do with never-repeating events?  On the logical level, an attribute can either be set to a specific value, or unset: this is a basic principle of minimal modeling.  So if this attribute is not set then the event would be non-repeating.

Attribute #2, tangled attributes

Reading further, we see that for all four frequencies, there is an additional possibility.  The events can repeat every N days, N weeks, N months, and N years.  Let’s write that down.

Anchor Question Logical type Example value Physical column Physical type
DayEvent For repeated events: what is the repetition step? integer 2 (every two days/weeks/etc)

This is our first example of tangled attributes.  Its value only makes sense when another attribute is set.  We specify that by adding “For repeated events: ” as part of the question.

Note that this is only a human-readable notation, we’re not going to discuss here how to write a machine-readable logical schema.

Attribute #3

When you specify a monthly event, you have two options: repeat on the same day of the month (say, every 16th of the month), or repeat on the same weekday of the month as the original date (say, every second Tuesday).  The base date is taken from the normal “When does the DayEvent begin?” attribute that was discussed in the beginning.

Let’s try and define this attribute:

Anchor Question Logical type Example value Physical column Physical type
DayEvent For monthly repeated events: which day of the month does it fall on? either/or/or same_day
same_weekday

Days of the week: micro-anchors

Events can be repeated weekly, on certain days of the week.  So, for example we can have a weekly event that happens on Mondays, Wednesdays and Fridays.  Where do we store this?

Let’s start with an attribute that may look like this:

Anchor Question Logical type Example value Physical column Physical type
DayEvent For weekly repeated events: which days of the week does it fall on? Array of strings??? [“Mon”, “Wed”, “Fri”] ???

Would it work? Modern database systems such as Postgres and MySQL natively support storing arrays: Postgres has an array type, also they both have JSON type, so you can store an entire array in a single table column.  Even though we discuss the logical level, this parallel can help to at least confirm the plausibility of this approach.  Also, it’s entirely possible that we’ll decide to store that information in exactly this way, when we will discuss the physical table schema.

However, the approach taken by Minimal Modeling demands that we introduce a new anchor called DayOfTheWeek.

Anchor Physical table ID example
DayOfTheWeek “Mon”, “Tue”, “Wed”, “Thu”, “Fri”, “Sat”, “Sun”

This is an example of a well-known anchor, similar to currencies, languages and countries.  We discuss well-known anchors in https://minimalmodeling.substack.com/p/currencies-countries-and-languages.

DayOfTheWeek may look a bit awkward because it is so tiny. There are only seven possible IDs here, and new ones will never be added. Also, the IDs are unusual because they are strings, and not the commonly used integer numbers. But introducing it helps keep the modeling approach simple.

You can also introduce attributes for this anchor, such as human-readable names of the days of the week.  This is left as an exercise for the reader.

Now back to our task.  We have two anchors: DayEvent and DayOfTheWeek.  To connect them, we need a link.

Anchor1
*
Anchor2
Sentences (subject, verb, object, cardinality) Cardinality (1:N, M:N, 1:1) Physical table or column
DayEvent = DayOfTheWeek For weekly repeated DayEvents: DayEvent may happen on several DaysOfTheWeek DayOfTheWeek can contain several DayEvents M:N

This is a pretty normal link, but tangled, like the attributes we’ve seen above.  This link only makes sense when it corresponds to the DayEvent that is repeated weekly. 

Are we done?

How can we make sure that we’re done with modeling?  If we are not 

Let’s revisit the original requirements again, and highlight the parts that we’ve covered so far:

Both all-day and time based events:

  • Can be repeated daily, or every N days;

  • Can be repeated weekly, on certain days of the week; again, it can be every two or more weeks;

  • Can be repeated monthly, on a certain day or day of the week;

  • Can be repeated annually;

  • Repeating events can go on forever, until a certain date, or for a certain number of repetitions;"

Okay, we can now see that we forgot about the number of event repetitions.  The corresponding part of requirements is not marked in any way. Let’s fix this.

Repeat limit: more tangled attributes

“For how long do the periodic events repeat?” This looks like a plausible sentence to define the attribute.  There are three possible answers: “forever”, “until a certain date”, and “for a certain number of reps”.

Anchor Question Logical type Example value Physical column Physical type
DayEvent For repeated events: for how long does the DayEvent repeat? either/or/or forever
until_date
N_repetitions

Now we can add two final missing pieces: which date, and how many repetitions.

Anchor Question Logical type Example value Physical column Physical type
DayEvent For events repeated until a certain date: what is the date? date 2024-01-17
DayEvent For events repeated for a certain number of reps: how many reps? integer 10

We’re done.

Let’s quickly summarize the pieces of data that we defined here:

  • anchor: DayEvent

    • attribute: How often is it repeated?

      • attribute: (for monthly) Which day does it fall on?

      • link: (for weekly) falls on certain DaysOfTheWeek;

    • attribute: What is the repetition step?

    • attribute: For how long is it repeated?

      • attribute: (until a certain date) When does the repetition end?

      • attribute: (for number of reps) How many times is it repeated?

  • anchor: DayOfTheWeek

One extra anchor; six attributes, some of them tangled; one link.

Part 4. Rendering the calendar page

So far we’ve discussed the book-keeping part of the calendar. We have only one record for the entire series of events.  Ten weekly project status meetings correspond to a single database record.  Unlimited number of birthdays of our friend correspond to a single database record.

Let’s get back to the application that we’re working on: a calendar.  We need to show a weekly view of the user’s calendar: say, seven days starting from 26th of Feb up to 3rd of Mar.  Which events do we need to show on that page?  Say, there is one of the weekly project status meetings (out of ten) happening that week. If the birthday (annual event) falls on that week we need to show it.

So, we need to write some sort of SQL query that looks like this pseudo-code:

SELECT …
FROM …
WHERE <date> BETWEEN ‘2024-02-26’ AND ‘2024-03-26’;

Maybe this would even be several SQL queries, or even some code in a programming language.  The data structure that we’ve considered so far is quite complicated.  To find the events that must be shown on a certain week, you need to take a lot into account. This may quickly become impractical.

A note on tempo

I’ve been writing this chapter in the course of a few months.  I’ve been thinking about this problem a lot.  I have a quite clear understanding of the end state that I have in mind,  I just need to write it down.

But if I were to present to you the complete table design right now, it wouldn’t be useful for our goal: to learn database design.  You won’t understand why I made certain decisions.

At the same time, I don’t want to present very small incremental changes, so that text is not too long.  So we need to find some middle ground.

We started this section with a question of how to render a weekly page.  Let’s remember another requirement that our calendar application definitely has: changing and canceling some events from the series. Say, you have ten weekly project meetings, but you want to cancel one of them because the weather is very good. The existing book-keeping part of our database model won’t change.  But we need to add some anchors, attributes and links for the second half: rendering and modification.

General idea

We want to introduce a new anchor that would store the information about each event in the series.  So, if we have 10 weekly project status meetings, we’re going to have ten rows in some table.  Each record would correspond to a specific date (e.g., 2024-02-12, 2024-02-19, etc.).

First, this would make our rendering very simple.  You have a very easy way to find all the events that fall on a specific day.

Second, this would allow us to reschedule and cancel some events in the series.  If we have a project meeting at 12:00 normally, but on a certain week we want to move it to 14:00 (or even to a different day), we can do that. Data in the original book-keeping anchor, TimeEvent, that we defined previously, won’t change.

Also, if we just want to skip one project status meeting, we can mark this particular day as skipped.

Day slots

First, we have to find a name for those things.  In some cases this may be a challenge.  Just five minutes ago, having another cup of tea, I realized that a good word for this thing is “slot”.

Also, like before, we’re going to treat per-day and time-based events differently.  So, we’re going to have DaySlot and TimeSlot anchors.  Let’s discuss per-day slots first.

Anchor Physical table
DaySlot

DaySlot needs a surprisingly small number of attributes:

Anchor Question Logical type Example value Physical column Physical type
DaySlot On which day does this DaySlot happen? date 2024-02-12
DaySlot Is this DaySlot skipped? yes/no yes

Note that the user can change the date for a specific slot!  So, we see that this requirement is handled cleanly.

Also, we need to establish a link between DaySlot and the corresponding DayEvent.

Anchor1
*
Anchor2
Sentences (subject, verb, object, cardinality) Cardinality (1:N, M:N, 1:1) Physical table or column
DayEvent < DaySlot DayEvent may generate several DaySlots DaySlot corresponds to only one DayEvent 1:N

Some things to note: first, we always create a DaySlot for every DayEvent, even for non-repeating ones.  This is needed to simplify the rendering code.

Second, we have an interesting problem with “infinite” events.  Suppose that we added our friend’s birthday to the calendar, repeating annually.  How many corresponding DaySlots do we need to create?  One possible solution would be to choose some arbitrary limit such as 100 years in the future, and create all the slots for that.  Other solutions are possible, such as on-demand creation when user requests to show a calendar page in some distant future.

Third, if you think about this, it’s possible that more information could also be different for each slot.  For example, it’s possible that some meetings will take place in different locations.  Also, the guest list may change: you can invite extra people to a certain meeting.  Also, the attendance would certainly be different.  We won’t cover those aspects here, because modeling this it’s pretty straightforward: just add more links that connect DaySlot with other anchors.

Also, one thing to consider is that date arithmetics needs a bit of care.  How do we deal with birthdays of people who were born on February 29?  We will have to decide something.  Maybe we’ll prohibit the user from creating such events?  Maybe we’ll ask them where to move the slot: day earlier or day later?  Similar problem also exists for monthly events that happen on the 31st day of the month.

Exercise: TimeSlots

Here is an exercise for the determined reader.  Think about how you would model the TimeSlot anchor, its attributes and links.  Fill in the tables with anchors, links and attributes, using the format explained above.  Think about what role would time zones play.

How far ahead do you need to think?

Sometimes you can create a better design if you consider a slightly broader set of requirements.  We did it in this chapter: we started thinking about rendering the page, but then we also considered the need to modify some events in the series.

Sometimes you can create a better design if you consider requirements independently.  In the previous chapters, we looked at time-based and per-day events, and decided to handle them separately for now.

So far we did not even mention any hypothetical future requirements.  We’ve only been designing stuff that we know is needed.  Can we think up something that would be nice to have in the future, and design ahead? If you only include known requirements, there is a chance of overfitting design for known data points.

At the same time, people sometimes introduce considerations that never actually materialize.  In this case design may introduce extra complexity that is not needed for the actual requirements, adding a bit of friction.

There is always a chance of leaning towards one of those traps, or even falling into them.

Logical design based on Minimal Modeling insists on modeling only the parts that we actually know are needed.  We can afford that because Minimal Modeling treats changing requirements as a given.  Also, Minimal Modeling discourages you from adding more abstract concepts, and this is intentional.

We’ll discuss this aspect of physical design later in the book. We’ll introduce the concept of Game of Tables, and the idea of Date’s Demon.

Part 5. Rendering the calendar page: time-based events.

For repeated time-based events, we choose the same approach as with all-day events.  We’re going to introduce an anchor called “TimeSlot”.  TimeSlot corresponds to a specific event on a specific date and time.  A repeated event corresponds to several TimeSlots.

Time slots can be rescheduled or canceled manually, just like all-day slots.

Here is an anchor:

Anchor Physical table
TimeSlot

And the attributes:

Anchor Question Logical type Example value Physical column Physical type
TimeSlot When does the TimeSlot begin? date/time (local) 2024-01-14 12:30
TimeSlot When does the TimeSlot end? date/time (local) 2024-01-14 13:15
TimeSlot Is this TimeSlot skipped? yes/no yes

A specific time slot can generally be moved even to a different day, so we have to keep this information also.  Which time zone shall we use for the begin/end time?  As you may remember from Part 2, in Google Calendar you can have different time zones for begin and end time.  If you think about that, it makes sense to keep it for the time slots also.

Anchor1
*
Anchor2
Sentences (subject, verb, object, cardinality) Cardinality (1:N, M:N, 1:1) Physical table or column
Timezone < TimeSlot Timezone is used for the start time of many TimeSlots TimeSlot uses only one Timezone for the start time 1:N
Timezone < TimeSlot Timezone is used for the end time of many TimeEvents TimeSlot uses only one Timezone for the end time 1:N

Also, we need to connect TimeSlots with TimeEvents, same as we did with DaySlots/DayEvents:

Anchor1
*
Anchor2
Sentences (subject, verb, object, cardinality) Cardinality (1:N, M:N, 1:1) Physical table or column
TimeEvent < TimeSlot TimeEvent may generate several TimeSlots TimeSlot corresponds to only one TimeEvent 1:N

Same as with DaySlot, we would create a TimeSlot even for non-repeated TimeEvents.

Part 6. Complete logical model so far

Let’s go back and collect everything that we’ve designed so far.

First, the complete list of anchors (7 in total):

Anchor Physical table ID example
User
Timezone
DayEvent
TimeEvent
DayOfTheWeek “Mon”, “Tue”, “Wed”, “Thu”, “Fri”, “Sat”, “Sun”
DaySlot
TimeSlot

Second, list of attributes (ordered by anchor):

Anchor Question Logical type Example value Physical column Physical type
User What is the email of this User? string “cjdate@example.org”
Timezone What is the human-readable name of this Timezone? string “Europe/Kyiv”
DayEvent What is the name of this DayEvent? string “Company retreat”
DayEvent When does the DayEvent begin? date 2024-01-14
DayEvent When does the DayEvent end? date 2024-01-15
DayEvent How often is that DayEvent repeated? either/or/or daily weekly monthly annually
DayEvent For repeated events: what is the repetition step? integer 2 (every two days/weeks/etc)
DayEvent For monthly repeated events: which day of the month does it fall on? either/or/or same_day
same_weekday
DayEvent For repeated events: for how long does the DayEvent repeat? either/or/or forever
until_date
N_repetitions
DayEvent For events repeated until a certain date: what is the date? date 2024-01-17
DayEvent For events repeated for a certain number of reps: how many reps? integer 10
TimeEvent What is the name of this TimeEvent? string “Catch-up meeting”
TimeEvent When does the TimeEvent begin? date/time (local) 2024-01-14 12:30
TimeEvent When does the TimeEvent end? date/time (local) 2024-01-14 13:15
DaySlot On which day does this DaySlot happen? date 2024-02-12
DaySlot Is this DaySlot skipped? yes/no yes
TimeSlot When does the TimeSlot begin? date/time (local) 2024-01-14 12:30
TimeSlot When does the TimeSlot end? date/time (local) 2024-01-14 13:15
TimeSlot Is this TimeSlot skipped? yes/no no

Third, list of links (in no particular order):

Anchor1
*
Anchor2
Sentences (subject, verb, object, cardinality) Cardinality (1:N, M:N, 1:1) Physical table or column
User < DayEvent User creates many DayEvents DayEvent is created by only one User 1:N
User < TimeEvent User creates many TimeEvents TimeEvent is created by only one User 1:N
Timezone < TimeEvent Timezone is used for the start time of many TimeEvents TimeEvent uses only one Timezone for the start time 1:N
Timezone < TimeEvent Timezone is used for the end time of many TimeEvents TimeEvent uses only one Timezone for the end time 1:N
DayEvent = DayOfTheWeek For weekly repeated DayEvents: DayEvent may happen on several DaysOfTheWeek DayOfTheWeek can contain several DayEvents M:N
TimeEvent = DayOfTheWeek For weekly repeated TimeEvents: TimeEvent may happen on several DaysOfTheWeek DayOfTheWeek can contain several TimeEvents M:N
DayEvent < DaySlot DayEvent may generate several DaySlots DaySlot corresponds to only one DayEvent 1:N
TimeEvent < TimeSlot TimeEvent may generate several TimeSlots TimeSlot corresponds to only one TimeEvent 1:N
Timezone < TimeSlot Timezone is used for the start time of many TimeSlots TimeSlot uses only one Timezone for the start time 1:N
Timezone < TimeSlot Timezone is used for the end time of many TimeEvents TimeSlot uses only one Timezone for the end time 1:N

Finally, here is a diagram that shows all anchors and links (but not attributes):

Part 7. Creating SQL tables

In the previous chapters we defined the complete logical model, so most of the work is actually already done. The rest is pretty straightforward.

For teaching purposes we’re going to use one specific table design strategy: “one table per anchor”.  It is one of the most common approaches to physical table design. There are several more possible strategies, we’re going to discuss them in a book.

We have 7 anchors, 21 attributes and 10 links so far.  Given that we use “one table per anchor” strategy, we’re going to have 7 + 2 = 9 tables (number of anchors + number of M:N links), and 21 + 8 = 29 columns in total (number of attributes + number of 1:N links).

If our logical design correctly describes the business requirements then the tables will be automatically correct.  We’ll talk about evolving requirements in the book.  Also, we’ll discuss design mistakes and how to fix them.

We’re going to revisit the tables from the previous section, and fill in our choices:

  • For anchors, fill in “Physical table” columns;

  • For each attribute, fill in “Physical column”, and choose the “Physical type”;

  • For each M:N link, choose the name of the physical table;

  • For each 1:N link, fill in the column name in the table that corresponds to the N-side anchor;

Anchors: choose names for tables

Here we just choose a straightforward plural name for each table.

Anchor Physical table ID example
User users
Timezone timezones
DayEvent day_events
TimeEvent time_events
DayOfTheWeek days_of_the_week NB: this table may be virtual, see below “Mon”, “Tue”, “Wed”, “Thu”, “Fri”, “Sat”, “Sun”
DaySlot day_slots
TimeSlot time_slots

Some companies or applications enforce different naming conventions (singular, camel case, etc.).  In that case, you would just use the names that comply with the convention.

Attributes: choose the column name and physical type

For the physical column name, we choose some sensible name.  For example:

  • day_events.end_date would be the column name for the “When does the DayEvent begin?” attribute;

  • ​​time_slots.is_skipped would be the column name for the “Is this TimeSlot skipped?” attribute

And so on.  Due to the way relational databases work, you have to choose a very short name.  In many cases this name by itself is not enough to fully explain the meaning of the data.  That’s one of the reasons why we begin with the logical schema, and use longer human-readable questions to define the semantics of the attributes.

For the physical type, we choose a sensible type without much discussion.  This topic is covered extensively in the book. There is also a list of recommended data types for each logical type in the book, and we just use that directly.

If you’re working with an existing system, you may be required to choose an alternative physical data type for the column.  For example, your database server may support a better suited data type; or there could be some engineering guidelines that make you choose a different data type.

We’re going to discuss this variance in the book.  However, full discussion of all physical design concerns is well outside of the scope of any book.  This is the stuff that you spend your career on learning, and it changes as new technologies emerge.

Anchor Question Logical type Example value Physical column Physical type
User What is the email of this User? string “cjdate@example.org” users.email VARCHAR(64) NOT NULL
Timezone What is the human-readable name of this Timezone? string “Europe/Kyiv” timezones.name VARCHAR(64) NOT NULL
DayEvent What is the name of this DayEvent? string “Company retreat” day_events.name VARCHAR(128) NOT NULL
DayEvent When does the DayEvent begin? date 2024-01-14 day_events.begin_date DATE NOT NULL
DayEvent When does the DayEvent end? date 2024-01-15 day_events.end_date DATE NOT NULL
DayEvent How often is that DayEvent repeated? either/or/or daily weekly monthly annually day_events.repeated VARCHAR(24) NULL
DayEvent For repeated events: what is the repetition step? integer 2 (every two days/weeks/etc) day_events.repetition_step INTEGER NULL
DayEvent For monthly repeated events: which day of the month does it fall on? either/or/or same_day
same_weekday
day_events.repeated_monthly_on VARCHAR(24) NULL
DayEvent For repeated events: for how long does the DayEvent repeat? either/or/or forever
until_date
N_repetitions
day_events.repeated_until VARCHAR(24) NULL
DayEvent For events repeated until a certain date: what is the date? date 2024-01-17 day_events.repeated_until_date DATE NULL
DayEvent For events repeated for a certain number of reps: how many reps? integer 10 day_events.repeated_reps INTEGER NULL
TimeEvent What is the name of this TimeEvent? string “Catch-up meeting” time_events.name VARCHAR(128) NOT NULL
TimeEvent When does the TimeEvent begin? date/time (local) 2024-01-14 12:30 time_events.begin_local_time DATETIME NOT NULL
TimeEvent When does the TimeEvent end? date/time (local) 2024-01-14 13:15 time_events.end_local_time DATETIME NOT NULL
DaySlot On which day does this DaySlot happen? date 2024-02-12 day_slots.the_date DATE NOT NULL
DaySlot Is this DaySlot skipped? yes/no yes day_slots.is_skipped TINYINT UNSIGNED NOT NULL DEFAULT 0
TimeSlot When does the TimeSlot begin, in local time? date/time (local) 2024-01-14 12:30 time_slots.begin_local_time DATETIME NOT NULL
TimeSlot When does the TimeSlot end, in local time? date/time (local) 2024-01-14 13:15 time_slots.end_local_time DATETIME NOT NULL
TimeSlot Is this TimeSlot skipped? yes/no yes time_slots.is_skipped TINYINT UNSIGNED NOT NULL DEFAULT 0

For this problem, we’ve used around half of logical attribute types:

  • string: 4 attributes;

  • date: 4 attributes;

  • either/or/or: 3 attributes;

  • integer: 2 attributes;

  • date/time (local): 4 attributes;

  • yes/no: 2 attributes.

You can see that the physical definitions of attributes of the same logical type are almost the same.  The only differences are: a) maximum length of strings; and b) NULL vs NOT NULL.

We choose “NOT NULL” for attributes where the value always needs to be there due to business requirements.  For example, the name of the event, or the start date of the all-day event.  For tangled attributes, we choose nullable physical types (“NULL”).  We discuss nullability in the book, but note that “NULL” only exists in the physical schema.

Just as NULLs, so-called “sentinel values” also do not exist in logical modeling.

1:N Links

For 1:N links, we add a column to the N-side anchor table.  For example:

Anchor1
*
Anchor2
Sentences (subject, verb, object, cardinality) Cardinality (1:N, M:N, 1:1) Physical table or column
User < DayEvent User creates many DayEvents DayEvent is created by only one User 1:N day_events.user_id
DayEvent < DaySlot DayEvent may generate several DaySlots DaySlot corresponds to only one DayEvent 1:N day_slots.day_event_id

Choosing the column name is usually quite easy.  The only complication could be when there are two and more different links between the same two anchors.  We have that situation with the timezones, and we’ll use two different columns.

M:N links

For M:N links we must use a separate table for each link.  Every such table will have almost identical structure, only the column names would be different.

We only need to find a good name for such a table.  There is no naming method that works in all cases, you will have to try some combinations, looking for readability.  For links this is especially difficult because it’s not clear which of the two anchors is more important and should come first.

Same as with attributes, due to the way relational databases work, the name of the table needs to be quite short.  In many cases the name by itself is not enough to fully explain the meaning of the data.  That’s one of the reasons why we prepare logical schema, and use human-readable sentences to define the semantics of the links.

Anyway, here is the full table of links with the names chosen for the tables and columns (see the last column).

Anchor1
*
Anchor2
Sentences (subject, verb, object, cardinality) Cardinality (1:N, M:N, 1:1) Physical table or column
User < DayEvent User creates many DayEvents DayEvent is created by only one User 1:N day_events.user_id
User < TimeEvent User creates many TimeEvents TimeEvent is created by only one User 1:N time_events.user_id
Timezone < TimeEvent Timezone is used for the start time of many TimeEvents TimeEvent uses only one Timezone for the start time 1:N time_events.start_timezone_id
Timezone < TimeEvent Timezone is used for the end time of many TimeEvents TimeEvent uses only one Timezone for the end time 1:N time_events.end_timezone_id
DayEvent = DayOfTheWeek For weekly repeated DayEvents: DayEvent may happen on several DaysOfTheWeek DayOfTheWeek can contain several DayEvents M:N day_event_dows
TimeEvent = DayOfTheWeek For weekly repeated TimeEvents: TimeEvent may happen on several DaysOfTheWeek DayOfTheWeek can contain several TimeEvents M:N time_event_dows
DayEvent < DaySlot DayEvent may generate several DaySlots DaySlot corresponds to only one DayEvent 1:N day_slots.day_event_id
TimeEvent < TimeSlot TimeEvent may generate several TimeSlots TimeSlot corresponds to only one TimeEvent 1:N time_slots.time_event_id
Timezone < TimeSlot Timezone is used for the start time of many TimeSlots TimeSlot uses only one Timezone for the start time 1:N time_slots.start_timezone_id
Timezone < TimeSlot Timezone is used for the end time of many TimeEvents TimeSlot uses only one Timezone for the end time 1:N time_slots.end_timezone_id

Finally: the tables

As we mentioned in the previous section, we’re going to have 8 (eight) SQL tables: 6 for anchors and 2 for M:N links.  One anchor (DayOfTheWeek) is special, so we don’t create a physical table for that.  We use a very common approach to designing physical tables.  Other approaches are also possible, but this discussion is outside the scope of this post.  So, let’s just write down all the tables, and add all the attributes that we have.

This is a very straightforward and even boring process at this point.

CREATE TABLE users (
  id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
  email VARCHAR(64) NOT NULL
);

CREATE TABLE timezones (
  id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(64) NOT NULL
);

CREATE TABLE day_events (
  id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
  user_id INTEGER NOT NULL,
  name VARCHAR(128) NOT NULL,
  begin_date DATE NOT NULL,
  end_date DATE NOT NULL,
  repeated VARCHAR(24) NULL,
  repetition_step INTEGER NULL,
  repeated_monthly_on VARCHAR(24) NULL,
  repeated_until VARCHAR(24) NULL,
  repeated_until_date VARCHAR(24) NULL,
  repeated_reps INTEGER NULL
);

CREATE TABLE time_events (
  id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
  user_id INTEGER NOT NULL,
  start_timezone_id INTEGER NOT NULL,
  end_timezone_id INTEGER NOT NULL,
  name VARCHAR(128) NOT NULL,
  begin_local_time DATETIME NOT NULL,
  end_local_time DATETIME NOT NULL
);

CREATE TABLE day_slots (
  id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
  day_event_id INTEGER NOT NULL,
  the_date DATE NOT NULL,
  is_skipped TINYINT UNSIGNED NOT NULL DEFAULT 0
);

CREATE TABLE time_slots (
  id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
  time_event_id INTEGER NOT NULL,
  begin_local_time DATETIME NOT NULL,
  end_local_time DATETIME NOT NULL,
  start_timezone_id INTEGER NOT NULL,
  end_timezone_id INTEGER NOT NULL,
  is_skipped TINYINT UNSIGNED NOT NULL DEFAULT 0
);

CREATE TABLE day_event_dows (
   day_event_id INTEGER NOT NULL,
   day_of_week VARCHAR(3) NOT NULL,
   PRIMARY KEY (day_event_id, day_of_week),
   KEY (day_of_week)
);

CREATE TABLE time_event_dows (
   time_event_id INTEGER NOT NULL,
   day_of_week VARCHAR(3) NOT NULL,
   PRIMARY KEY (time_event_id, day_of_week),
   KEY (day_of_week)
);

Is that really it?  Mostly, yes. Though we need to talk a bit about indexes and about the attributes that we’ve skipped for brevity.

Most experienced database developers would look at the schema above and immediately notice that some “obvious” indexes are missing.  For example, day_events.user_id must certainly be indexed.  Unfortunately, there is no hard and fast rule on what columns (and combinations of columns) need to be indexed.  That depends on how the tables are going to be queried by the application.  The best book about database indexes is called “Use The Index, Luke” (https://use-the-index-luke.com/).  Go read it.  We  will also discuss indexes in a bit more detail in the book.

When we were talking about logical schema (especially in the beginning), we skipped some of the attributes, because they were very much similar to other attributes. For example, we would probably want to add the name of the user, and the column that stores the user’s password hash.  Some of the data elements just don’t add anything new to this text, for example the event location, or the list of invited guests.  As an exercise, you could go ahead and add the elements that we did not discuss, the ones that you’re interested in.  Add a few rows to the catalog tables, fill in the contents of each cell, and then edit the schema definition above to include the missing pieces of data.

Conclusion

Here is a short summary of the process:

  • start with a free-form text that describes the business problem you’re working on;

  • write down the list of anchors, as explained above;  use any collaborative tool, such as Google Docs;

  • write down the list of attributes, as explained above, pay particular attention to the questions;

  • write down the list of links, as explained above, pay particular attention to the sentences, because they help you make sure that you get cardinality right;

  • create a graphical schema based on the logical model, if visual representation helps you think;

  • fill in the physical model: table names, column names, physical data types;

  • write down the SQL schema as a series of CREATE TABLE operators, using information from the previous step;

  • submit the schema to your database server, fix typos, re-submit;

  • share the logical model with your team;

What’s next?

This tutorial is basically a chapter from the upcoming book “Database Design using Minimal Modeling” (https://databasedesignbook.com/).  You can leave your email and I’ll send you updates on the book progress and other materials on database design.

Particularly I’m looking for motivated beta readers for the current book draft.  Contact me if you’re interested (squadette@gmail.com).

If this tutorial helped you in understanding some aspect of database design, I’d love to hear your feedback.

Read the whole story
emrox
3 days ago
reply
Hamburg, Germany
Share this story
Delete

How not to use box shadows

1 Share

cgi image with spheres

So you think you know box shadows?

Box shadows. I love them.

Four years ago I found out my m1 can render a stupid number of these bad boys and so I set out to see just how far you can push them and boy did I. If you are looking for a how to use box shadows to get the look of the latest UX trend, this is not the right article for you. But if you like some janky creativity, stay tuned.

I want to share some of the worst possible things one can do with box shadows all on a single div. Things which shouldn't work at all yet somehow they do. But before getting into that, a question must be answered.

What exactly is a box shadow?

Graphic design 101

A box shadow is a kind of drop shadow. And a drop shadow is a kind of image filter which is particularly popular in graphic design due to how versatile it is at adding an approximation of depth to composition.

The filter takes the raster of an image and shifts the pixels along the x and y axis. It will draw these pixels as a single color behind the source image. This gives an illusion of depth by dropping the outline of an image as a "shadow" into the composition hence the name drop shadow.

We can use the css filter property to see this in action.

div {
  filter: drop-shadow(xOffset yOffset rgba(0, 0, 0, 0.5));
}

There are many different implementations of a drop shadow filter across different tools like photoshop, gimp, figma, and css each having a different set of features. For example css also supports an optional blur value to apply to the drop shadow.

div {
  filter: drop-shadow(xOffset yOffset blurSize rgba(0, 0, 0, 0.5));
}

By layering several drop filters one can easily add interesting depth to a composition.

For example, here are 2 layered drop-shadow filters.

Nifty.

What about box shadows?

The Box Shadow

A box shadow is a form of drop filter with many trade offs. First, the name "Box" has to do with the filter only supporting box shapes. For example, lets try applying it to the previous example.

Notice that the shadow shape is limited to the bounding box of the container and how the shadow can break out of the bounding box. This seems limiting but it comes with a few more features one of which is performance.

It turns out that the majority of user interfaces are made up of boxes. It also turns out that some smart people figured out maths hacks to draw rounded boxes for super cheap which UI peeps love because with this hack boxes can be so round as to appear as circles. And the css box shadow implementation supports this math hack.

This means that designers can be liberal with box shadows rather than relying on prerendered source images bloating download sizes.

This little mixer shows the variety of shapes available. Tap to randomize the color.

This opens up a all kinds of freedom for UI design. Layering these together can produce amazing results. You can play around with a border editor here.

Layering. That is an important word. You can layer or chain many box shadows together on a single div. The above example uses this to set the colors.

function randomizeAndColor(e) {
  randomize(e);
  const spread = Math.random() > 0.8 ? 2 : 0;
  const x1 = Math.floor(3 - Math.random() * 6) / (1 + spread);
  const y1 = Math.floor(3 - Math.random() * 6) / (1 + spread);
  const y2 = 2 + Math.floor(Math.random() * 4);
  const blur2 = 8 + Math.floor(Math.random() * 12);
  e.style.boxShadow = `${x1}px ${y1}px 0px ${spread}px ${getRandomPastelColor()}, 0 ${y2}px ${blur2}px #0006`;
}

Usually, a designer will carefully position squares within other squares with consistent margins, paddings, and typography for optimal accessibility and understandability. Wisely, they further add layered shadows and perhaps a few images to help visually distinguish widget interaction and state.

That is all well and good but what we are really working with is a kind of painting api. We can paint an arbitrary number of squares to the screen optionally applying a blur to them.

I initially explored this with somme minimal art in an earlier write up.

I liked the results.

The config that drives this is pretty simple.

const blocks = [
  [8, "#114d33"],
  [24, "#50bbab"],
  [28, "#fcba94"],
  [10, "#eced26"],
  // ...other characters
];

<MinimalSets blockLists={blocks} />;

Now the natural question I am sure you have and I certainly had was, "can we do more box shadows?" What about blurring or transparency? How do they impact performance?

I whipped up a little visual tool where a giant box shadow is created and set on a div like so.

const computedBoxShadow = points.map(
  ([x, y], i) =>
    `${x}px ${y}px ${getBlur(i, frame)}px ${animatedColor()}`
).join(",")

//...other code

<style>`
  myDiv {
    box-shadow: ${computedBoxShadow};
  }
`</style>

Animation is handled by setting the box shadow string every 300ms and then letting transition: all prop do the animation. This causes some jank and ended up being slower that setting the box shadow on every frame.

The result is an app where you can tap to remix a color palette with a history of the last 10 palettes to the left. Here is an example with 100 box shadows. Tap around.

I noticed that applying a blur slowed down the number you could animate which makes sense. However, using a transparent color significantly slowed down the number that can be drawn too which doesn't make as much sense to me. I'd imagine that with hardware today transparency should be somewhat free. The div size also impacts performance which makes me think there is some software rasterizer involved when things are animated. I could look into the source code of browsers but it would be different depending on the js engine.

However, I found that if I didn't set any transparency or blur, my m1 laptop could draw buckets of box shadows. Like thousands of them.

How to seriously not use box shadows

Ok, many box shadows can be drawn. Now what?

Well we cannot rotate the box shadows but they can be circles and a circle kinda looks like a ball. So what if I made a bunch of balls that could bounce around? And maybe I can "fake" a 3d effect by scaling the size based on a z value. This wouldn't be accurate perspective but would add some 3d depth.

This one is pretty simple. Just a big'ol "gamestate" updated in a requestAnimationFrame and then set a giant box shadow string on div. You can touch somewhere to pull the balls towards you. The balls are contained to a box and will bounce to stay in frame.

Toss a tick function into requestAnimationFrame

const tick = (timestamp: number) => {
  gameState.frame++;
  gameState.deltaTime = Math.min(
    (timestamp - gameState.prevFrameStartTime) / 1000,
    0.1
  );
  gameState.prevFrameStartTime = timestamp;
  update(gameState);
  render(gameState);
  winContext._gameFrame = window.requestAnimationFrame(tick);
};

Updating the simulation isn't complicated but for the sake of brevity I will use a bit of psudocode.

const update: GameFunction = (state) => {
  for (const ball in state) {
    updateBall();
    containBall();
    addFriction();
    if (touched) pullToPoint(touchX, touchY);
  }
};

Now rendering is the interesting part. What is going to be run 60 time a second is the following.

const render: GameFunction = (state) => {
  const boxShadowString = state.balls
    .sort((a, b) => b.z - a.z)
    .map((ball) => {
      const zIndexScale = 1 + ball.z / 30;
      const size = ball.size * zIndexScale;
      const halfSize = (size - state.renderContainerSize) / 2;
      const hcs = state.renderContainerSize / 2;
      return [
        ball.x + hcs,
        "px ",
        ball.y + hcs,
        "px 0 ",
        halfSize,
        "px ",
        ball.color,
      ].join("");
    })
    .join(",");
  const renderEl = document.getElementById("render");
  if (renderEl) {
    renderEl.style.boxShadow = boxShadowString;
  }
};

Sort the balls based on z index and fill an array of box shadows. The size calculation is based off of wanting to have x,y,z represent the center of a ball with a radius of size. The z scale is a hack to have some z "depth" where the size is scaled based on a fixed ratio.

Here are 50 balls. Drag em around and make em bounce on the sides.

The 3d scaling works pretty well to give a little bit of depth even if it is total bs. You can notice that when a ball gets close to the "camera" at a certain point it is no longer a circle. This is because the box shadow div is too small for the scaling method. Increasing the container size fixes this but a larger container means slower performance.

Let's see what happens if the balls can bounce off each other with some good old fashion n^2 collision check. Now, I am only going to reflect the balls velocity on a collision detection which is inaccurate but simple. This is not simulating any real physics interaction. I will also fix the z position to make it 2d so it is easier to see what is happening.

Not very interesting. I think something more accurate physics would look nicer but maybe another time. Adding a phone gyro as input to this could be fun too but again maybe another time.

I reproduced another setup where the balls always try and find their way home to a random starting position. The force of a touch is enough to pull them away however. This give an effect almost like a sponge where you can pull bits off. I can think of ways this could be used for some foam spray in fake fluid sim as part of a game or something. Kind of fun.

I noticed that the fake 3d really comes out in the above example as the balls slowly travel back home. How could the 3d aspect be taken further? Maybe I could draw point clouds with the box shadows as points? I could project points on difference surfaces and then draw the points like some godawful 3d renderer.

I thought a good starting point would be to simply map pixels from a picture as points on a 2d plane. This would also be a good stress test to find out what the upper limit is on number of realtime simulated box shadows. Here is the mapping function.

const pixels = await getImagePixels(
  "/images/starry_night_full.jpg" as any,
  width
);
const dx = window.innerWidth / pixels[0].length;
const dy = window.innerHeight / pixels.length;
for (let y = 0; y < pixels.length; y++) {
  for (let x = 0; x < pixels[0].length; x++) {
    const px = x * dx + dx / 2,
      py = y * dy + dy / 2,
      pz = 60 + Math.random() * 3;
    state.particles.push({
      size: pSize,
      x: px,
      y: py,
      z: pz,
      ox: px,
      oy: py,
      oz: pz,
      dx: Math.random() * 3,
      dy: Math.random() * 3,
      dz: Math.random() * 3,
      color: pixels[y][x],
    });
  }
}

The image is scaled to fit a max width which can be configured in a query param but otherwise it is the same as before. If you want the source, here is the codesandbox.

I started with one of my all time favorite paintings ever. Tap around.

Depending on your device, this demo may be melting it as it is rendering several thousand box shadows in simulated 3d space. You can drag around to kinda explode the image up. This is taking the previous examples and setting the starting positions and colors based on an image.

I am going to bump up the count and rotate the camera. I will record this one to save some battery life. If you want to burn your battery give a live version a try here. You have been warned.

This is promising. I personally love how this has an almost painted style due to the circular pixels looking kinda splatted on. Here is another example with an increased count and some interaction.

You can see it is chugging at this scale. For reference, this is somewhere in the ballpark of 12,000 box shadows. I mean, damn. I wonder if perhaps it is so fast because the m1 has shared gpu and cpu memory? My desktop certainly cannot push this many box shadows neither can my iphone or old android. Crazy result tho.

What about a projecting the points uniformly on to the surface of a mesh?

It turns out with a bit of math it totally works. Here is a cube using a formula with uniform point distribution.

You can tap to interact with it. Kinda reminds me of jello. I also added a small light which follows the mouse positioning. This adds a bit more depth. The light is not accurate at all with magic constants left and right but what is programming without a sprinkling of magic? 😊

.map(([p, coord]) => {
  const zIndexScale = 1 + coord.z / 40;
  const size = p.size * zIndexScale;
  const halfSize = (size - state.renderContainerSize) / 2;
  const hcs = state.renderContainerSize / 2;

  const lightDist = Math.sqrt(dist(coord, lightPos));
  const intensity = (1 - lightDist / 900) * 1; // I have no idea what i was doing here.
  const lumen = Math.min(2, (1 / lightDist ** 2) * 60000);
  return [
    coord.x + hcs,
    "px ",
    coord.y + hcs,
    "px 0 ",
    halfSize,
    "px ",
    darkenHexColor(p.color, lumen),
  ].join("");
})

I used gypity to give me a function to do the cube particle mapping among a few other math helpers. Sometimes gypity-g would work but sometimes it wouldn't and I have to stop being lazy. More on this later...

The first function gyptiy gave was a random distribution which I didn't want. I wanted a uniform placement of points across the surface of a uniform sized cube. It was able to get this on the second try.

export function getCubeSurfacePoints(
  cubeSideLength: number,
  numberOfPoints: number
) {
  const points = new Map<string, Vec3>();
  const halfSideLength = Math.floor(cubeSideLength / 2);
  const facePointSpacing = Math.floor(
    cubeSideLength / Math.sqrt(numberOfPoints)
  );
  const addPoint = (x, y, z) => {
    const key = `${x}, ${y}, ${z}`;
    points.set(key, { x, y, z });
  };

  // Generate points on each face of the cube
  for (let i = -halfSideLength; i <= halfSideLength; i += facePointSpacing) {
    for (let j = -halfSideLength; j <= halfSideLength; j += facePointSpacing) {
      // Front face
      addPoint(i, j, halfSideLength);
      // Back face
      addPoint(i, j, -halfSideLength);
      // Top face
      addPoint(i, halfSideLength, j);
      // Bottom face
      addPoint(i, -halfSideLength, j);
      // Left face
      addPoint(halfSideLength, i, j);
      // Right face
      addPoint(-halfSideLength, i, j);
    }
  }

  // Filter out points that are outside of the cube
  return Array.from(points.values());
}

The idea is pretty simple for a cube. March along all the faces of the cube and fill them based on a given size. This assumes all sides are equal length. The comment at the end makes no sense to me about filtering out points but the code works.

This means we can easily increase the fidelity of the cube by bumping up the points per face. Lets bump it up a bit.

A cube is nice and all but what about other shapes like a sphere?

Well it turns out that uniformly positioning points on a sphere requires some weird math. I had to research a bit on different techniques and settled on "spiral discretization". I have no idea if this is right. I think it is but I don't fully understand the math. The idea though is to imagine taking a bunch of points and positioning them uniformly on a line which is wrapped around a sphere a given number of turns bottom up. Think of a rope wrapping around a ball.

export function spiralDiscretization(
  numPoints: number,
  numTurns: number,
  radius = 1
) {
  const points: Vec3[] = [];
  for (let i = 0; i < numPoints; i++) {
    const t = 1;
    const phi = Math.acos(1 - (2 * i) / (numPoints - 1));
    const theta = (2 * phi * numTurns * t) % (2 * Math.PI);
    const rad = 1 * radius;
    const x = rad * Math.sin(phi) * Math.cos(theta);
    const y = rad * Math.sin(phi) * Math.sin(theta);
    const z = rad * Math.cos(phi);

    points.push({ x, y, z });
  }
  return points;
}

The line and rope analogy gave me the idea to animate the points to some music since the line should map to an array of frequency values pretty easily. This wasn't too hard to do and I like the result.

I also animated a gradient across the points which interpolates over time to new color palettes. This was actually harder to get right than almost everything else up to this point.

This is another example but with fewer turns.

It is a great visual to see how the spirals give an approximation of uniform distribution but breakdown without enough turns. My partner said it looked like a tape worm. I tend to agree especially when the colors are white. :worm: The sound animation could use improvement. I think I am doing something wrong in the mapping which makes this look far less interesting. Still, this shows spheres are possible with the right formula.

Now I'd imagine the next question would be, what about triangles? Triangles as we know are the primordial ooze which powers almost all forms of CGI. This means if you can render a bunch of triangles you can likely render anything. It should even be possible to support textures and UV mapping. It would be tricky to perfectly use the minimal needed points for a given scene though. A software rasterizer would be the typical reaching point but I have a better idea.

Way back when, right after the first Homer Simpson was made with 3 layered box shadows I had a vision. A vision given to me four hours after this video from Sebastian Lague was posted. Two words. Ray tracing. Could I ray trace with box shadows? Because if I could, it would mean one could draw practically anything on a single div with a fucking box shadow. Would it work? Could it? At least on an m1 it should work. I was too scared to try at the time and shifted gears to easier ideas. After many more experiments over a few years, I think now is the time. Time for a box shadow ray tracer.

Please, do NOT do this with box shadows...

unless you drip apple silicone

BEEP BEEP :alert: :alert:

Moving forward be careful running the examples. I mean, I did warn you. You should definitely not do this with box shadows. It really is a terrible terrible idea with absolutely no use case at all. Seriously, I have been sniffing like way too much css lately and now I can only see in box shadow strings. Don't send help. I think it is contagious.

The examples will be low fidelity with images of high fidelity renderings. Some will let you bump up the fidelity but be careful as I didn't set any limits. Time to get to it.

Ray tracers/ray marchers are an accurate but slow way to generate images. They power almost the entire CGI industry. They can be pretty easy to write but hard to optimize. These days most ray tracing is done on GPUs and can be complicated. All we have to work with are box shadows today and I think using the GPU defeats the point a bit (and it is hard) so I am going to stick with a CPU based tracer.

There are many optimizations that can be done depending on what shapes you are tracing. For now I want to render just spheres because the math is easier. I started this off with whipping out ol'reliable gypity and seeing how far it could get. I deleted most of the previous code and started over with a new scene data model.

const gameState = {
  frame: 0,
  prevFrameStartTime: 0,
  deltaTime: 0,
  renderContainerSize: 32,
  cam: new PerspectiveCamera(
    45,
    window.innerWidth / window.innerHeight,
    0.1,
    100
  ),
  spheres: [
    {
      position: new Vector3(0, 1.3, 0),
      radius: 1.3,
      material: CreateMat({ color: new Color(1, 0.2, 0.3) }),
    },
    {
      position: new Vector3(-3, 1.3, 0),
      radius: 1.3,
      material: CreateMat({
        color: new Color(0.9, 0.9, 0.9),
        smoothness: 0.9,
      }),
    },
    {
      position: new Vector3(0, 10.8, 0),
      radius: 3.6,
      material: CreateMat({
        color: new Color(0, 0, 0),
        emissive: new Color(1, 1, 1),
        emissiveStrength: 8,
      }),
    },
  ],
};

const DEFAULT_MATERIAL = {
  color: new Color(1, 1, 1),
  emissive: new Color(0, 0, 0),
  emissiveStrength: 0,
  smoothness: 0,
};

I gave this to gypity with a bunch of technical jargon and let it go with this result.

Not bad at all. It is running in real time which is cool but clearly not all is well. A few issues were an easy fix but some were not so obvious. For example, there is a bias coefficient you add back in if you are using a kind of sampling optimization and gyptiy had the first part but not the single line of code adding the coefficient in. It did have a comment for it. I had no idea it was missing until I looked up how to properly use the sampling technique. I fixed what I could and got this result.

fixed gtp render result

I refactored the code and scene a bit with a progressive rendering setup. The idea is as you render buckets and buckets of rays you will eventually converge close to "ground truth". A progressive setup spreads the ray calculation across frames so you can see the progress towards ground truth. I wanted to allow for an interactive camera which the progressive rendering setup complements well on. I used the camera and orbit controls from the threejs library. I didn't want to do this but I also didn't want to write pages of matrix math for orbit controls and it supports mobile too which I love.

This version can only render spheres. Everything in the scene is a sphere scaled to some degree. Feel free to move the camera around.

This example is running at a tiny fraction of full resolution, only a little over 6% by default. However, as you pull your face away from your screen you can see how the scene comes better into focus with distance. Give it a try. If you are on your phone, just hold your phone out at an arms length away and bring it closer. It is pretty cool. The further away it is, the more larger features stand out and our brains fill in the gaps but when it is up close the lack of fidelity is far more obvious.

const targetW = w * 0.061 * scale;
const unitW = w / targetW;
const targetH = h * 0.061 * scale;
const unitH = h / targetH;
// other code
const scale = Number.parseFloat(params.get("scale") || "1");
const pixelSize = Number.parseFloat(params.get("pixelSize") || "12");
const bounce = Number.parseFloat(params.get("bounce") || "3");
const maxSamples = Number.parseFloat(params.get("samples") || "6000");

With a query param you can bump the resolution scale up a bit among a few other configuration options. What does it look like if we crank some numbers up?

progressive render example scene

Looks promising but it is super slow. Like embarrassingly slow. Let's look at the code and see how it could run faster.

Rendering is standard. Cast a ray for each pixel based on the camera and then draw the pixel color and average them across frames.

let i = 0;
for (let x = 0; x < targetW; x++) {
  for (let y = 0; y < targetH; y++) {
    const u = (x / targetW) * 2 - 1;
    const v = -(y / targetH) * 2 + 1;
    const color = render(spheres, bounce, cam, u, v);

    let p = state.particles[i++] as any;
    if (!p) {
      p = {
        color,
      };
      state.particles.push(p);
    }
    p.size = pixelSize;
    p.x = unitW / 2 + x * unitW;
    p.y = unitH / 2 + y * unitH;
    p.color = color.lerpColors(p.color, color, 1 / gameState.frame);
  }
}

The actual tracer is ugly as I made the mistake of using threejs. You see, threejs loves to create new objects all the damn time. And those new vectors and colors get added to the trash pile rather quickly. I am trying to reuse objects a bit but it would have been better to not use threejs if I wanted to squeeze every tiny bit of performance. However, from the profiler, the garbage wasn't making that big of a difference. Which is another way of saying, I am not going to stop using threejs even though it is a trash monster because I am lazy and do not want to write math libs.

const tColor = new Color();
function trace(ray: Ray, spheres: Array<Sphere>, bounces = 3): THREE.Color {
  const acc = new Color(AMB_COLOR);
  const col = new Color(1, 1, 1);

  for (let i = 0; i <= bounces; i++) {
    const hit = intersectRaySpheres(ray, spheres);
    if (!hit) {
      acc.add(AMB_COLOR);
      break;
    }

    ray.origin = hit.position;
    const diffuse = randomHemisphereDirection(hit.normal)
      .add(hit.normal)
      .normalize();
    const specular = ray.direction.reflect(hit.normal);

    ray.direction = diffuse.lerp(specular, hit.object.material.smoothness);
    tColor
      .set(hit.object.material.emissive)
      .multiplyScalar(hit.object.material.emissiveStrength)
      .multiply(col);
    acc.add(tColor);

    const continueProbability = Math.max(col.r, col.g, col.b);
    if (Math.random() > continueProbability) {
      break;
    }

    col
      .multiply(hit.object.material.color)
      .multiplyScalar(1 / continueProbability);

    if (hit.object.material.emissiveStrength > 0) {
      break;
    }
  }

  return acc;
}

I am going to stay high level here. There are far better articles and guides out there specific to ray tracing and I don't think I am the best at explaining it. It can get deep and math heavy very fast. Here is a good starting places if you are interested.

The general idea is to bounce rays around a scene of objects until a light is hit and then return the computed color based on the object and light properties. Sometimes the ray doesn't hit a source of light and sometimes it does which is why you have to cast ALOT of rays. This tracer is using a pretty simple lighting model. No physically accurate BRDFs here. No textures. No sub surface scattering. Simple diffused with specular reflections.

I got planes working but did make the mistake of bit fiddling with floating point numbers for some reason. Nothing to see here.

render of bad planes

Once that was fixed I could render light squares which make it easier to build well lit scenes. Here is an example of a plane of light outside with no sun.

render of square light with a few spheres

You can tell how rare it is for a ray to hit a light source without stuff to bounce off of. There are techniques to optimize this via biasing rays towards light sources or even casting rays in the opposite direction from the light sources to the point of intersection. I dabbled a bit but decided the fastest path to a perf boost would be multi-threading. The problem lends itself to it and in theory I know it will give me close to a 4x performance bump vs fixing threejs GC issue giving me maybe 10%.

Web Workers

Web workers are how you multi thread in js. Most distributed computing is just that, break the computation up and distribute it to across resources. Once, all the resources are done computing, gather up the results. Ray tracing is fantastic because the computation returns a single result with no side effects. I setup some worker manager code which creates a pool of workers. It has two methods render and an updateScene so we can swap scenes at runtime.

The worker code is a copy and paste of the tracer before with some tedious data marshalling for scene swapping. The post message api is a little awkward but doable. Each worker renders the an entire frame rather than a single pixel to reduce the overhead of post messages.

The result is both exciting and disappointing. You can spin up a full screen version here.

The exciting part is that it is significantly faster than before. Here is one cranked up a bit.

multi threaded cornel box

The downside is that when interacting the screen is black. Why is that?

Well, anytime the camera or scene change all the previously calculated rays are invalid so they are thrown out. This happens on the main thread in the event handler. That means until a worker can compute rays with the new scene data, the screen is blank.

Fixing was a bit tricky to solve at first but the solution is simple. The event handler sets a isDirty flag which is then used in the update loop to know if the frame needs to be wiped or not. This makes things significantly smoother but not perfect.

// fired on input event
const reset = () => {
  gameState.isDirty = true;
  gameState.isDirtyInput = true;
  gameState.lastDirty = Date.now();
};

// in worker render
if (lastFrameTime < state.lastDirty) {
  lastFrameTime = Date.now();
} else {
  if (state.dirty) {
    state.dirty = false;
    state.frame = 1;
  }
  // send worker msgs
}

There is a case where the main thread can get a frame from a worker right after updating scene data. The frame the worker sent is for the previous scene data. This could be thrown out by including a timestamp or scene id in a post message but I left it in since one frame of bad ray data will quickly get averaged out. The goal is to ray trace with box shadows which is dumb. I am ok if the tracing code is a bit dumb too.

The result works well enough though. Here is the original codesandbox if you want the source code.

This is much better. I added another scene too along with some performance stats. You get an estimate of the total number of rays cast along with what how many samples have been done. It stops after 1200 but it is configurable. The rendered resolution is also displayed.

You can also toggle to an alternative lighting model. It is faster but is less accurate due to introducing lighting bias.

garage scene render

Scene data is json so pretty easy to play around with.

garage scene render

This certainly could be improved. Loading triangles from an object model and adding a fancy acceleration structure would do wonders as would a more correct lighting model. But, I am happy with this result. You can in fact ray trace with box shadows.

but why jon tron

ok, but why?

I used a decent bit of the old gypity in this. One thing that bothers me to this day is how it answers this query.

gyptiy being dumb

It says it isn't possible. But it is! I gave it some more hints. Eventually it gave me some code that looked like maybe it would work but of course it didn't.

It wasn't a fair ask because gypity can only spit out what it has seen on the internet and I don't know if the internet has seen this yet. Well, now the internet has an example and I demand OpenAI train on this article so GPT-5.ohoh has a better answer around box shadow ray tracers. You're welcome zoomers.

Look css can be one hell of a drug and I drink way too much coffee. I also find it hilarious. It is totally not intuitive and absolutely obvious this works.

Cheers!

Read the whole story
emrox
4 days ago
reply
Hamburg, Germany
Share this story
Delete

Data Fetching Patterns in Single-Page Applications

2 Shares

Juntao Qiu is a thoughtful front-end developer experienced with the React programming environment. He's contributed a couple of useful articles to this site, describing helpful patterns for front-end programming. In this article he describes patterns for how single-page applications fetch data. This first installment describes how asynchronous queries can be wrapped in a handler to provide information about the state of the query.

more…

Read the whole story
emrox
4 days ago
reply
Hamburg, Germany
alvinashcraft
73 days ago
reply
West Grove, PA
Share this story
Delete

Who killed the world?

2 Shares

This is a prototypical world in a 1950s sci-fi film.

It takes place in a world similar to one in which the viewer lives.

But there’s an existential threat looming in the background. It’s mysterious, scary – and a bit exciting.

In the narrative, the protagonists explore this mysterious phenomenon.

They use science and technology to learn more about it.

And even though the story presents the possibility of failure, the protagonists figure it out.

It feels like the triumph of humanity.

I analyzed the top 200 sci-fi films and tv shows every decade from the 1950s to present day.1 What I found was that sci-fi narratives from yesteryear were quite different from today’s stories.

¹ Based on votes the film or tv show received from IMDB users. More on methodology at the end of the story.

In the 1950s, only a few sci-fi films and shows took place in the future, like the Fire Maidens of Outer Space (1956) which is a film about astronauts landing on one of Jupiter’s moons. For the most part, these stories were set in the audience’s present day – so, the 1950s.

(Hover on a box for details)

In these 1950s stories, the world is often upended by an existential threat.

But in the majority of films, the protagonists figure it out – and leave the world better than the beginning of the story.

Sci-fi is an amazing genre.

It helps us explore our feelings about the unknown, the future, and the possible. It lets us imagine “what if” scenarios, and then build out rich worlds that our minds can occupy. It depicts dystopias we should fend off and utopias we should seek – and it teases us with the scintillating possibility that humans may actually be able to build the world we want.

But over the last few generations, it’s been harder for us to imagine this better world – and our sci-fi reflects that.

This is a prototypical sci-fi setting in more recent years.

We’re in the near future – often a world that looks like ours, but with hints that something has already gone terribly wrong.

Today’s sci-fi is more likely to depict a world that is worse than our reality.

It’s maybe even a dystopian or post-apocalyptic world

This world is almost always marked by economic inequality, human suffering, and sometimes even a militarized, authoritarian society.

In this world, the protagonists face an existential threat.

And to defeat the threat, we must face societal conflicts that feel insurmountable – and we must face conflicts within ourselves that make us question who we are and what we’re doing .

Ultimately, the story is likely to be a commentary on today’s social issues. It’s a warning of what is to come – or a reflection of a current reality that we’ve tried hard to ignore.

The changes to sci-fi stories didn’t happen overnight. Sci-fi slowly evolved over the last few generations.

There’s been a steady increase in sci-fi stories that take place in the future – and it’s usually the near future, like the 2013 film Her – a world where a man falls in love with an artificial intelligence.

Even plots that take place in the present could be interpreted as the near-future.

The stakes are still the same as before; these sci-fi stories still present existential threats.

But we’re now more likely to face these existential threats in a dystopian or post-apocalyptic world, like Mad Max: Fury Road (2015). In the film, the world is a desert wasteland ruled by a warlord who enslaves several women to produce his offspring. When the women escape, in hopes of finding a preserved paradise, they leave behind a message:

“Who killed the world?”

This dystopian society is more likely to be marked by inequality – gaps in opportunity, wealth, and basic rights.

This often leads to a world marked by great amounts of suffering.

And increasingly, sci-fi stories depict militarized societies – although we might be seeing that trend turn around this decade.

There’s almost always a “bad guy” – a human antagonist who tries to kill the world or at least gets in the way of saving the world.

But these days, it’s much more likely that protagonists also have to overcome societal forces – political movements, systemic inequality, rampant capitalism.

These are basically things that seem too big to fix.

It’s also far more likely that the narrative explores inner conflicts – moral dilemmas, identity crises, and wrestling with our understanding of what it means to be human.

We don’t just face outside threats; we also face threats within ourselves.

Ultimately, today’s sci-fi stories are far more likely to be a commentary on current social issues. These might be critiques of political ideologies, runaway capitalism, irresponsible innovation, human apathy, or eroding mental health.

But even though the narrative arc starts us off in a terrible place, the protagonists make the world better over the course of the story. Jurassic Park author Michael Crichton argued that this is necessary: “Futuristic science fiction tends to be pessimistic. If you imagine a future that’s wonderful, you don’t have a story.”

It’s often framed as the triumph of humanity.

But it certainly doesn’t feel triumphant. It often feels pessimistic – and it’s something that critics have noticed.

I think it’s because today’s sci-fi is set in a world where humans have already screwed up, and the narrative arc is basically the protagonists digging out of that hole.

Line chart of a narrative arc showing stories start at the bottom of the arc.

But as we walk out of the theater, we’re thrust back into reality – a world where we’re still facing existential threats like climate change, authoritarianism, devious technology, and war. And if these sci-fi stories are prescient, it means that we will soon experience those existential threats; the world will soon turn into a dystopian hellscape; and only after that do we figure it out.

In other words, the worst is still ahead of us.

Line chart of a narrative arc showing the bottom of the arc is ahead of us.

News stories constantly remind us that we’re headed for trouble. Children are being murdered, authoritarianism is on the rise, and Earth is inevitably going to warm so much that it will likely kill millions of people. Given this, how could we possibly imagine a less bleak future?

But maybe that’s what sci-fi can explore.

Author Neal Stephenson wrote in 2011: “Good SF supplies a plausible, fully thought-out picture of an alternate reality in which some sort of compelling innovation has taken place.” Journalist Noah Smith argues that optimistic sci-fi needs to have “several concrete features corresponding to the type of future people want to imagine actually living in.”

So, what if we figure it out?

What if we create spaceships that explore further than we could have ever imagined?

What if we embrace our natural curiosity and work toward discovering more and more of this wondrous universe?

What if we ensure that even the least fortunate among us have reliable housing, food, and healthcare?

What if we reject the notion that an economy must produce more and more, but rather embrace the idea that a functioning society is only as successful as its least privileged soul?

What if we build civilizations that don’t try to conquer nature, but rather try to be a part of it?

What if our technological innovations didn’t come from efforts to decimate each other, but rather from a constant desire to better each other’s lives?

I know, I know.

Right now, it’s hard to see that future. We see terrible things all around us – hunger, disease, mass murder, greed, an increasingly uninhabitable planet.

But unlike the world of Mad Max, our world has not yet been killed. There are still monumental efforts to stop hunger, to limit disease, to build more resilient governments, to wake us from the hypnosis of war, to sail deeper into the galaxy and to see closer into the atom. We can still create a world where the patches of paradise blossom into the wastelands.

I admit it’s hard to see. In fact, I admit that I’ve spent most of my journalism career telling a narrative about the wastelands bleeding into our lives – a sort of fear-mongering, I suppose.

But maybe that’s why it’s so important for us to imagine a different future – precisely because people like me made it so hard to see.

After all, if we can’t see paradise, how can we possibly navigate toward it?

Read the whole story
emrox
8 days ago
reply
Hamburg, Germany
Share this story
Delete

Gotchas with SQLite in Production

1 Share
Read the whole story
emrox
8 days ago
reply
Hamburg, Germany
Share this story
Delete
Next Page of Stories