Announcing Atlas v0.3.0: A UI-powered schema migration experience

Earlier this week we released v0.3.0 of the Atlas CLI. This version features a ton of improvements to database inspection, diffing, and migration planning. You can read about those on the release notes page, but we wanted to take the time to introduce the biggest feature in this release: the Atlas Management UI.

To recap, Atlas is an open-source CLI tool that helps developers manage their database schemas. Unlike existing tools, Atlas intelligently plans schema migrations for you, based on your desired state. Atlas currently has two main commands: inspect and apply. The inspect command inspects your database and generates an Atlas HCL document describing its current schema. The apply command migrates your schema from its current state in the database to your desired state, which you provide as an HCL file describing the desired schema.

In this post we will showcase the latest addition to the CLI’s feature set: the Management UI. Until now, you could use Atlas to manage your schemas only via your terminal. While this is the common interface for many infrastructure management workflows, we believe that a visual, integrated environment can be beneficial in many use cases.

Inspecting our database using the Atlas UI

Let’s see how we can use the Atlas UI to inspect our database.

For the purpose of demonstration, let’s assume that you have a locally running MySQL database. If you want to follow along, check out the Setting Up tutorial on the Atlas website for instructions on starting up a MySQL database locally using Docker.
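
If you just want a quick, throwaway instance to experiment with, something along the following lines should work (the container name, password, and database name here are assumptions chosen to match the connection string used below):

# Start a disposable MySQL 8 container, exposed on localhost:3306
docker run --name atlas-db -p 3306:3306 -e MYSQL_ROOT_PASSWORD=pass -e MYSQL_DATABASE=example -d mysql:8.0.27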

We will be working with a MySQL database that has the following tables:

CREATE TABLE users (
    id int PRIMARY KEY,
    name varchar(100)
);
CREATE TABLE blog_posts (
    id int PRIMARY KEY,
    title varchar(100),
    body text,
    author_id int,
    FOREIGN KEY (author_id) REFERENCES users(id)
);
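
One way to load these tables, assuming the Docker setup sketched above (a container named atlas-db, root password pass, database example) and the statements saved to a file named schema.sql, is to pipe them into the mysql client inside the container:

# Feed the CREATE TABLE statements above into the example database
docker exec -i atlas-db mysql --password='pass' example < schema.sql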

To inspect the database, we can use the atlas schema inspect command. Starting with this version, we can add the -w flag to open the (local) web UI:

atlas schema inspect -d "mysql://root:pass@tcp(localhost:3306)/example" -w

Our browser will open automatically, and we should see this output in the CLI:

Atlas UI available at: http://127.0.0.1:5800/projects/25769803777/schemas/1
Press Ctrl+C to stop

inspect_image

We can see that our schema has been inspected, and that it’s currently synced. On the bottom-left part of the screen, the UI displays an ERD (Entity-Relationship Diagram) showing the different tables and the connections between them (via foreign keys). On the bottom-right, we can see the current schema, described using the Atlas DDL. In addition, on the top-right, we see the “Activity & History” panel that holds an audit history of all changes to our schema.
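
For reference, the inspected users table from above is rendered in the Atlas DDL roughly as follows (a sketch using the HCL conventions shown later in this post; the actual output may differ slightly):

table "users" {
  schema = schema.example
  column "id" {
    null = false
    type = int
  }
  column "name" {
    null = true
    type = varchar(100)
  }
  primary_key {
    columns = [table.users.column.id, ]
  }
}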

Migrating our database schema with the Atlas Management UI

Visualizing the current schema of the database is great, but let’s now see how we can use the UI to initiate a change (migration) to our schema.

Click on the Edit Schema button in the top-right corner and add the following two tables to our schema:

table "categories" {
  schema = schema.example
  column "id" {
    null = false
    type = int
  }
  column "name" {
    null = true
    type = varchar(100)
  }
  primary_key {
    columns = [table.categories.column.id, ]
  }
}

table "post_categories" {
    schema = schema.example
    column "post_id" {
        type = int
    }
    column "category_id" {
        type = int
    }
    foreign_key "post_category_post" {
        columns     = [table.post_categories.column.post_id, ]
        ref_columns = [table.blog_posts.column.id, ]
    }
    foreign_key "post_category_category" {
        columns     = [table.post_categories.column.category_id, ]
        ref_columns = [table.categories.column.id, ]
    }
}

Click the Save button and go back to the schema page. Observe that a few things changed on the screen:

The UI after saving

First, we can see that the UI states that our schema is “Out of Sync”. This is because there is a difference between our desired schema, the one we are currently working on, and the inspected schema, which is the actual, current schema of our database.

Second, we can see that our ERD has changed, reflecting the addition of the categories and post_categories tables to our schema; the newly added tables are shown in green. By clicking the “expand” icon in the top-right corner of the ERD panel, we can open a more detailed view of our schema.

ERD displaying diff

Going back to our schema page, click the “Migrate Schema” button to initiate a migration that applies the changes we want to make to our schema. Next, Atlas will set up the migration. Click “Plan Migration” to see the plan for getting to the desired schema:

Migration Prep

Atlas displays the schema diff in HCL on the left pane, and the planned SQL statements on the right. Click “Apply Migration” to begin executing the plan.

Migration Plan
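
For the two tables we added, the planned statements should look roughly like the following (a sketch; the exact SQL Atlas generates may differ in details such as column nullability):

CREATE TABLE `example`.`categories` (`id` int NOT NULL, `name` varchar(100) NULL, PRIMARY KEY (`id`));
CREATE TABLE `example`.`post_categories` (`post_id` int NOT NULL, `category_id` int NOT NULL, CONSTRAINT `post_category_post` FOREIGN KEY (`post_id`) REFERENCES `example`.`blog_posts` (`id`), CONSTRAINT `post_category_category` FOREIGN KEY (`category_id`) REFERENCES `example`.`categories` (`id`));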

In the final screen of the migration flow, Atlas displays informative logs about the migration process. In this case, our migration completed successfully! Let’s click “Done” to return to the schema detail page.

Applying Migration

As expected, after executing our migration plan, our database and desired schema are now synced!
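
If you’d like to double-check from the terminal as well, you can list the tables directly (again assuming the local Docker container sketched earlier):

# List all tables in the example database
docker exec atlas-db mysql --password='pass' example -e "SHOW TABLES;"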

Post Migrations

Wrapping Up

In this post, we’ve introduced the Atlas Management UI and showcased one of the workflows it supports. There’s much more inside, and we invite you to install it today and give it a try.


Meet Atlas CLI: Inspect and Apply changes to your database schema

At Ariga, we are building a new kind of platform that we call an Operational Data Graph. This platform enables software engineers to manage, maintain and access complex data architectures as if they were one database. Today, we are open-sourcing a CLI for Atlas, one of the fundamental building blocks of our platform.

During my career, the scope of what is expected of me as a software engineer has increased significantly. Developers are no longer expected just to write code; we are expected to provision infrastructure, manage databases, define deployments, and monitor systems in production.

Nowadays, one of the responsibilities we have as software engineers is to manage the database schema of our applications. Once seen as falling strictly within the domain of DBAs, this task is now owned by developers everywhere, who define database schemas and change them over time. Because an application’s database carries its state, all clients and servers are severely impacted if it stops functioning properly. Therefore, over the years many techniques and tools have been developed to deal with this process, which is called migrating the database.

In the last few years we have seen a lot of progress in the field of tools for provisioning infrastructure. From early projects such as Chef and Puppet, to more recent work such as Terraform, a lot of thought and effort has been invested across the industry in tools that simplify and standardize the process. The common thread between all of these projects is that, instead of relying on manually installing and configuring software and services, they are based on machine-readable definition files, a concept also known as infrastructure-as-code (IaC).

Enter: Atlas

Atlas is at the core of Ariga’s platform. In this post, I would like to share with you the work we’ve done so far to provide a solid foundation for managing databases in a way that’s akin to infrastructure-as-code practices.

  • The Atlas DDL (Data Definition Language): we have created the Atlas DDL, a new configuration language designed to capture an organization’s data topology - including relational database schemas. This language is currently described in an HCL syntax (similar to Terraform), but will support more syntaxes such as JSON and TypeScript in the future. The Atlas DDL currently supports defining schemas for SQL databases such as MySQL, Postgres, SQLite and MariaDB, but in the future, we plan to add support for other types of databases. For example:
table "users" {
  schema = "default"
  column "id" {
    type = "int"
  }
  column "name" {
    type = "string"
  }
  column "manager_id" {
    type = "int"
  }
  primary_key {
    columns = [
        table.users.column.id
    ]
  }
  index "idx_name" {
    columns = [
      table.users.column.name
    ]
    unique = true
  }
  foreign_key "manager_fk" {
    columns = [table.users.column.manager_id]
    ref_columns = [table.users.column.id]
    on_delete = "CASCADE"
    on_update = "NO ACTION"
  }
}
  • The Atlas CLI: on top of the building blocks provided by the DDL, we started building our CLI tool to support the two most basic functions (a quick sketch of both follows this list):

    • “Schema Inspect” - Create a schema specification file from a database.
    • “Schema Apply” - Migrate a database to a new desired state.
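
In their most basic form, these two commands look like this (using the same example connection string as the walkthrough below):

# Inspect an existing database and write its schema to a file
atlas schema inspect -d "mysql://root:pass@tcp(localhost:3306)/example" > atlas.hcl

# Plan and apply a migration towards the desired schema described in the file
atlas schema apply -d "mysql://root:pass@tcp(localhost:3306)/example" -f atlas.hcl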

Many infrastructure-as-code projects have taken the declarative approach, in which the developer articulates the desired state of the system and the tool is responsible for figuring out a plan to get there. As we discussed above, changing database schemas safely is a delicate practice, so we had to build the Atlas CLI to be smart enough to understand the nuance of changes for each type of database.

Atlas in action

Let’s see how the Atlas CLI works with a real database. Let’s start a MySQL container:

docker run --name atlas-db -p 3306:3306 -e MYSQL_ROOT_PASSWORD=pass -e MYSQL_DATABASE=example mysql:8.0.27

Connect to our database using a native client to validate:

docker exec -it atlas-db mysql --password='pass' example
mysql> show tables;
Empty set (0.00 sec)

mysql>

Let’s see how Atlas inspects it:

atlas schema inspect -d "mysql://root:pass@tcp(localhost:3306)/example" > atlas.hcl

As expected, an empty schema:

# cat atlas.hcl
schema "example" {
}

Let’s update our schema to:

# cat atlas.hcl
table "users" {
  schema = "example"
  column "id" {
    null = false
    type = "int"
  }
  column "name" {
    null = false
    type = "string"
    size = 255
  }
  column "manager_id" {
    null = false
    type = "int"
  }
  primary_key {
    columns = [table.users.column.id, ]
  }
  foreign_key "manager_fk" {
    columns     = [table.users.column.manager_id, ]
    ref_columns = [table.users.column.id, ]
    on_update   = "NO ACTION"
    on_delete   = "CASCADE"
  }
  index "idx_name" {
    unique  = true
    columns = [table.users.column.name, ]
  }
  index "manager_fk" {
    unique  = false
    columns = [table.users.column.manager_id, ]
  }
}
schema "example" {
}

And apply our changes!

atlas schema apply -d "mysql://root:pass@tcp(localhost:3306)/example" -f atlas.hcl



-- Planned Changes:
-- Add Table : users
CREATE TABLE `example`.`users` (`id` int NOT NULL, `name` varchar(255) NOT NULL, `manager_id` int NOT NULL, PRIMARY KEY (`id`), UNIQUE INDEX `idx_name` (`name`), CONSTRAINT `manager_fk` FOREIGN KEY (`manager_id`) REFERENCES `example`.`users` (`id`) ON UPDATE NO ACTION ON DELETE CASCADE) ;
Use the arrow keys to navigate: ↓ ↑ → ←
? Are you sure?:
  ▸ Apply
    Abort

Of course we are sure!

Using the MySQL CLI to examine our database:

mysql> describe users;
+------------+--------------+------+-----+---------+-------+
| Field      | Type         | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+-------+
| id         | int          | NO   | PRI | NULL    |       |
| name       | varchar(255) | NO   | UNI | NULL    |       |
| manager_id | int          | NO   | MUL | NULL    |       |
+------------+--------------+------+-----+---------+-------+
3 rows in set (0.00 sec)

mysql>

Let’s make sure that it has the FK:

mysql> show create table users;
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                                                                                                                                                            |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| users | CREATE TABLE `users` (
  `id` int NOT NULL,
  `name` varchar(255) NOT NULL,
  `manager_id` int NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `idx_name` (`name`),
  KEY `manager_fk` (`manager_id`),
  CONSTRAINT `manager_fk` FOREIGN KEY (`manager_id`) REFERENCES `users` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql>

Now let’s see that Atlas inspects this correctly:

atlas schema inspect -d "mysql://root:pass@tcp(localhost:3306)/example" > atlas.hcl
# cat atlas.hcl
table "users" {
  schema = "example"
column "id" {
    null = false
    type = "int"
  }
  column "name" {
    null = false
    type = "string"
    size = 255
  }
  column "manager_id" {
    null = false
    type = "int"
  }
  primary_key {
    columns = [table.users.column.id, ]
  }
  foreign_key "manager_fk" {
    columns     = [table.users.column.manager_id, ]
    ref_columns = [table.users.column.id, ]
    on_update   = "NO ACTION"
    on_delete   = "CASCADE"
  }
  index "idx_name" {
    unique  = true
    columns = [table.users.column.name, ]
  }
  index "manager_fk" {
    unique  = false
    columns = [table.users.column.manager_id, ]
  }
}
schema "example" {
}

Let’s see what happens when we try to reapply the same change:

atlas schema apply -d "mysql://root:pass@tcp(localhost:3306)/example" -f atlas.hcl
Schema is synced, no changes to be made

In this example we have shown how we can inspect a MySQL database schema and apply a change.

What’s Next?

The Atlas DDL opens up a world of tools and services, and with the help of our community, we are planning to push the development ecosystem forward. Tools on our roadmap include:

  • Integrations with Terraform, GitHub Actions, and Kubernetes.
  • Extended migration logic, such as renaming columns, adding or dropping nullability, and altering enums.
  • Toolsets for examining the migration history and reproducing it.

We hope that you find Atlas CLI as exciting as we do, and we invite you to contribute your ideas and code.

Two GitHub features that make bug reporting and triaging easier than ever

At Ariga, our software solution is heavily built on top of ent. Ent is a simple, yet powerful entity framework (ORM) for Go, that makes it easy to build and maintain applications with large data-models. Ent makes it possible to define any data model or graph-structure in Go code easily. Ent generates an idiomatic and statically-typed API for working with databases that keeps Go developers productive and happy. As we rely on Ent for building our Operational Data Graph platform, we are deeply invested in the project and are committed to making it and its community successful.

In this blog post, we would like to share with you our success story using two GitHub tools, template repositories and Codespaces, that help us maintain Ent by improving contributor productivity and the overall velocity of the project.

Our Story

As a project maintainer, it is in your best interest to ensure that your users are happy and receiving the help they need. By addressing bug reports and solving them, you are able to improve the quality of the project as well as assist users. The optimal way to receive a bug report is in the form of a reproducible example. Generally, this requires receiving code that is runnable in order to see a live example of the problem.

On GitHub, it is common practice to ask the user to open an issue and provide a temporary repository with the code to reproduce the bug. In Ent, there is a special bug reporting issue template that guides the user on how to properly describe the problem. Using the issue template makes the project maintainer’s life easy by ensuring that each issue includes all the necessary information and is written in an organized manner.

Once the project maintainer gets a hold of the repository, they can clone it locally and start debugging, rather than create a new project from scratch to try and reproduce the issue. However, this method can be risky. Although most users are friendly and don’t have bad intentions, you can never be too careful. You don’t know what’s in the code you are about to execute and what harm it may do. It is always possible to review the code, or to run it in an isolated environment, yet it takes time and effort to go to these lengths to ensure your safety.

In August 2021, GitHub released a new feature, Codespaces. GitHub Codespaces is a tool that offers cloud-powered shared development environments right from within a web browser, giving access to an entire Visual Studio Code experience. Directly from any repository you are able to run your code without any installations, as if you were fully configured and running on your own local machine.

Soon after Codespaces was released, Ariel (Ariga’s co-founder and Ent’s creator) decided to give it a try. He opened a repository provided by a user, intending to debug an issue, straight in Codespaces (you can view the issue here). With a click of a button, without any prior setup needed, he was in a completely ready-to-go environment and was able to start debugging right away! Within a minute, the bug was triaged and Ariel was able to get back to the user with the answer. This is any project maintainer’s dream! You want to be able to address as many issues as possible, diligently, and in a timely manner, without having to risk your own safety in the process.

issue image

Taking this idea a step further, I thought about using template repositories to refine bug reporting even more. Template repositories are a feature designed to “make boilerplate code management and distribution a first-class citizen on GitHub”. They are used as templates across multiple projects for simple code reuse, and they enable developers to easily create new repositories with the same directory structure, branches, and files. Generally, template repositories are used as a way to kickstart projects quickly; however, the Ent project came up with another use for them.

Ent launched the new ent/bug template repository. This template has Ent pre-installed, a basic user schema, and test connections to different databases. Now users can easily create a new repository from the bug template and add in their code. This way, users can share the necessary code for reproducing the bug, without having to spend much time on setting up a new project.
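
For example, assuming you have the GitHub CLI (gh) installed, creating a new repository from the template and cloning it locally can be done in one line (the repository name my-ent-bug-repro is just a placeholder):

# Create a public repository from the ent/bug template and clone it
gh repo create my-ent-bug-repro --template ent/bug --public --clone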

By utilizing the strengths of both Codespaces and template repositories, we have made both bug reporting and triaging much easier tasks. Using both of these tools is a win-win situation for project maintainers and the users alike. Project maintainers can now significantly cut down the time required for setting up environments to triage bugs, making the entire process much smoother, faster and safer. Similarly, users now have the ability to save time and effort and are able to consistently provide reproducible bug reports.

We highly encourage you to try using these tools in your own project and see what a difference they make!

Feel free to find me on the #ent channel in the Gophers Slack workspace if you want to discuss this topic or ask any further questions; I’d love to hear from you!

Data access should be an infrastructure problem

The dire state of data access

Modern cloud-native software systems are characterized by an abundance of data systems. It sometimes seems that many companies, especially those that adopt a microservices-oriented architecture early on, have more databases than engineers to manage them.

Much has been written about the complexity and overhead that is induced by adopting a microservices architecture. In fact, an entire landscape of cloud-native projects and companies has risen in the wake of the microservices revolution. New fields such as container orchestration, service mesh, and distributed tracing were created to deal with the sprawling complexity of these architectures.

The proliferation of data systems in typical cloud-native applications is a direct consequence of widely accepted design principles such as “Database per Service”. In addition, performance and cost considerations for applications running at a large scale led to the adoption of many specialized data storage technologies (such as in-memory key-value, document, and object/blob stores) that are now running side-by-side with more traditional relational databases. Furthermore, many applications opt to read and write data from third-party APIs that abstract the concrete storage technology away from them.

With this plethora of data technologies that a single application must access comes the direct consequence that data access becomes very complex. Functionality that relational database engines have supported for decades, like reading, writing, or deleting a record, transactions (performing multiple changes that must completely succeed or fail), and joins between different collections of data, has now become a set of distributed systems problems that are notoriously hard to design and implement correctly.

Like it or not, microservices architectures, along with their accompanying disintegrated data model, are here to stay. Even if monolithic deployments come back into fashion (e.g. “The Majestic Monolith”), it is very hard to imagine the same happening at the data infrastructure layer.

Technical infrastructure

Wikipedia defines Infrastructure as “the set of fundamental facilities and systems that support the sustainable functionality of households and firms. Serving a country, city, or other area, including the services and facilities necessary for its economy to function.” In other words, infrastructure is the common layer of systems that we need to sustain the functionality of our lives and companies. We cannot even begin thinking about running a successful business in our society without all of the fundamental infrastructures that modern cities provide us with: electricity, fresh water, waste removal, roads and transportation, communication networks, etc.

The evolution of software engineering can be thought of as a continuous process of taking the most common hard problems and turning them into technical infrastructure. Consider some seminal technical projects from recent years: “public cloud” services (such as AWS EC2 or Google Cloud), data infrastructure (such as Apache Spark or Kafka), networking, and messaging (such as gRPC or Envoy). The common thread between these technologies is that they have all become the technical infrastructure for today’s applications, they are the water lines, sewage systems, and electricity grids of our industry.

The Operational Data Graph

We founded Ariga out of frustration with the immense complexity of building software that is backed by a disintegrated data model. Such software is complex to design, build, and manage in production; it is complex to understand, secure, and govern. Ariel and I have always been obsessed with improving the process of building software, looking for ways to push the envelope on how to make hard things simpler, safer, and cheaper.

Together we steward Ent, a Linux Foundation-backed project that is loved by software engineers in companies of all sizes and shapes. Ent is an entity framework for Go that provides application developers with robust, privacy-aware APIs for working with their data. As we near our 10,000th stargazer on GitHub, we are looking to expand Ent’s success to more programming languages and distributed data architectures.

With our team of hard-core data and infrastructure engineers, we are working on a platform that is focused on providing an amazing developer experience that enables developers to manage, access, and maintain a complex data architecture as if it were a single database. We call it the Operational Data Graph.

This platform provides companies with a simple way to build and maintain data topologies that capture all of the company’s data systems in one connected graph. Next, the platform unlocks management capabilities to safely evolve the structure and schemas of the different components in the graph. Finally, developers can automatically expose different parts of the graph as APIs in many popular standards (such as REST/OpenAPI, GraphQL, gRPC, etc.) with our high-performance, privacy-aware access layer.

With this platform, we hope to usher in an era in software engineering where data access becomes an infrastructure problem, allowing developers to once again focus on providing value to their users instead of solving the same distributed systems problems again and again.