Breaking the Monolith Step by Step

This post is also available in Russian

Let's talk about modularization. It has been a hot topic in Android development for about five years now. During this time, I've had the opportunity to work on several large projects where modularization was in progress. Sometimes it went well, sometimes not so well, but I've seen a lot. I want to share this experience in a post with my thoughts on how it should be done.

Why do we need to break the project into modules?

Perhaps there is no need. A monolith is better than a poorly separated project. It's better in terms of build time and ease of use.

Modularization is needed if:

  • You want clear separation of concerns
    • Among developers or teams
    • In the code
  • You are experiencing problems with build speed

But it needs to be done thoughtfully and with thorough preparation.

Monolith

What is a monolith? Usually, it's quite easy to identify at first sight. If you have one app module, that's it. If there are already many modules, and it's not immediately clear which one is the largest, Gradle Scan can answer the question. If you see that a dozen modules are waiting for one to build, that one is probably it. This is our monolith from which we want to extract pieces.

Gradle Scan

In this scan, you can see two issues - building the monolith and minifying the application using R8. Unfortunately, we can't really do much with minification, but we'll try to deal with the monolith.

But this is only from the build time point of view. If we talk about separation of concerns, everything depends on your team. Conway's law still applies: the project will naturally be divided along the areas of responsibility of its teams. That is, you shouldn't focus only on the build timeline. It's also important to create modules that match your organizational structure, so that they are convenient to manage.

Gradle Scan shows clearly how your build works. Everything that can be executed in parallel should be executed in parallel. That is, the ideal project in terms of build time is an app module that depends on multiple modules that do not depend on each other.

It's clear that this is just a dream, because features will want to depend on each other sooner or later, and we create some modules to reuse between various other modules by design.

The worst example of an application in terms of build time is an app module that depends on a chain of many modules that depend on each other.

At this point, it's worth mentioning the concept of the critical path. The critical path is the longest path from the app module to some other node in the graph of module dependencies. Our goal is to keep its length within reasonable limits: on average, the longer the critical path, the slower the build. This is not always the case, since one large module can take longer to build than a chain of many small ones. But as a guideline, it's worth keeping the length of the critical path in mind.
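To make the definition concrete, here is a minimal sketch that finds the critical path in a module graph. The graph representation (a map from a module to the modules it depends on) and the module names are assumptions for illustration, and the graph is assumed to have no cycles. It counts modules only and ignores how long each module takes to build.

```kotlin
// Returns the longest dependency chain starting at `root`.
// `graph` maps a module to the modules it depends on (assumed to be acyclic).
fun criticalPath(graph: Map<String, List<String>>, root: String): List<String> {
    val memo = mutableMapOf<String, List<String>>()

    fun longestFrom(module: String): List<String> = memo.getOrPut(module) {
        val longestTail = graph[module].orEmpty()
            .map { longestFrom(it) }
            .maxByOrNull { it.size }
            .orEmpty()
        listOf(module) + longestTail
    }

    return longestFrom(root)
}

fun main() {
    // Wide graph: the app depends on independent features.
    val wide = mapOf(
        ":app" to listOf(":feature:a", ":feature:b", ":feature:c"),
    )
    // Deep graph: the same modules chained one after another.
    val deep = mapOf(
        ":app" to listOf(":feature:a"),
        ":feature:a" to listOf(":feature:b"),
        ":feature:b" to listOf(":feature:c"),
    )
    println(criticalPath(wide, ":app").size) // 2
    println(criticalPath(deep, ":app").size) // 4
}
```

The wide graph can build its three features in parallel, while the deep one has to build them one after another.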

Silver Bullet

I may disappoint someone, but you will get stuck on one problem after another, each very specific to your project. Therefore, it's impossible to make a conference talk or post on this topic that will be fully applicable to your project.

The key thought I want to convey is that the ideas of modularization are quite simple. The main difficulty is in your project's legacy. Remember, everything depends on your project and your team. All I can do is suggest some steps to start with.

Steps

Clean up Gradle

For some reason, on most of the projects I've seen, Gradle configs have been a mess that no one keeps track of. If you haven't cleaned up your Gradle, you should not start modularization at all. With each new module, it will get more complicated.

First, switch from the Android tab to the Project tab in Project Navigator. If you don't see some files, it doesn't mean they don't exist. Keep track of what's in each module. And configure Gradle just like you write code. DRY, KISS, and all of that.

Imagine for a second that someday we will have many modules, each with hardcoded dependencies, versions and basic plugin settings. Hard to maintain, right?

Our own Gradle plugins and version catalogs can help us with this. I won't go into detail, as there are enough materials on this topic. I'll just name the keywords: composite builds, convention plugins and version catalogs. These are your best friends if you want to make Gradle configs nice and keep all the logic in one place.
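As a rough illustration of the convention plugin idea, a precompiled script plugin might look like the sketch below. The file path, the plugin id `project.android.library` and the SDK levels are assumptions, not taken from a real project. The sketch also presumes an included build (called build-logic here) that applies the `kotlin-dsl` plugin and has the Android and Kotlin Gradle plugins among its dependencies.

```kotlin
// build-logic/src/main/kotlin/project.android.library.gradle.kts
// (hypothetical path; the file name defines the plugin id)
plugins {
    id("com.android.library")
    id("org.jetbrains.kotlin.android")
}

android {
    // Shared settings live here once instead of in every module (values are assumed).
    compileSdk = 34

    defaultConfig {
        minSdk = 24
    }
}
```

A library module would then apply `id("project.android.library")` in its own plugins block and declare only what is specific to it.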

Define Module Types

You can find a dozen excellent materials on this on the Internet. For example:

  • :app - thin modules from which apk/aab is created, app modules prepare a dependency graph and essentially do nothing more
  • :feature:api - thin modules with a public feature interface for external use. The interface describing this feature's own dependencies is also defined here
  • :feature:impl - thick modules implementing interfaces from :feature:api, can only depend on other :feature:api modules, but not on :feature:impl implementations of other features
  • :core - small utility modules separated by their concerns

Module types are needed to define which modules can depend on others and which cannot. These types help us keep the critical path short and stop large modules from depending on other large modules, which also speeds up the build.

Thanks to these types, our project (the module dependency graph) will grow quickly in width but slowly in depth. That is, even with a large amount of code, the build time of the project will stay reasonable.
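Rules like these can be checked mechanically. The sketch below derives a module's type from its path and reports forbidden dependencies. The only rule taken from the list above is that :feature:impl must not depend on another :feature:impl. The rest of the `allowed` table (for example, that api modules may depend only on core) is an assumption to adapt to your own agreements.

```kotlin
enum class ModuleType { APP, FEATURE_API, FEATURE_IMPL, CORE }

// Derives the type from the path, assuming the naming scheme from the list above.
fun typeOf(path: String): ModuleType = when {
    path == ":app" -> ModuleType.APP
    path.startsWith(":feature:") && path.endsWith(":api") -> ModuleType.FEATURE_API
    path.startsWith(":feature:") && path.endsWith(":impl") -> ModuleType.FEATURE_IMPL
    path.startsWith(":core:") -> ModuleType.CORE
    else -> error("Unknown module type: $path")
}

// Which module types each type may depend on (an assumed rule set).
val allowed: Map<ModuleType, Set<ModuleType>> = mapOf(
    ModuleType.APP to setOf(ModuleType.FEATURE_API, ModuleType.FEATURE_IMPL, ModuleType.CORE),
    ModuleType.FEATURE_API to setOf(ModuleType.CORE),
    ModuleType.FEATURE_IMPL to setOf(ModuleType.FEATURE_API, ModuleType.CORE),
    ModuleType.CORE to setOf(ModuleType.CORE),
)

// Returns every "module -> dependency" edge that breaks the rules.
fun violations(graph: Map<String, List<String>>): List<String> =
    graph.flatMap { (module, deps) ->
        deps.filter { typeOf(it) !in allowed.getValue(typeOf(module)) }
            .map { "$module -> $it" }
    }

fun main() {
    val graph = mapOf(
        ":feature:cart:impl" to listOf(
            ":feature:cart:api",
            ":feature:settings:api",
            ":feature:settings:impl", // impl depending on impl is forbidden
            ":core:network",
        ),
    )
    println(violations(graph)) // [:feature:cart:impl -> :feature:settings:impl]
}
```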

Generate Modules

Let's be honest, out of the box we have three ways to create modules:

  • Use New - Module in Android Studio
  • Create a folder yourself, specify the module in settings.gradle
  • Copy an existing module, specify it in settings.gradle

All these methods are awful: you either generate a lot of unnecessary stuff, or something necessary still needs to be added manually. Why don't we automatically generate modules exactly as we need, with all our rules and agreements?

Things are getting a bit complicated, aren't they? Not really. The biggest inconvenience of modularization for developers is creating modules. If it's hard to create modules, nobody will do it. If it's easy, there's a chance. Give your developers a convenient tool.

You need templates. A long time ago, Android Studio could work with FreeMarker templates out of the box, but that support has since been dropped. So you have to look for third-party solutions.

For example, the Android Studio plugin Geminio allows you to take these templates and create modules from them through a familiar interface in Android Studio.

We took a different path and wrote our own Gradle task that creates a module (for example, :core:network) from scratch, with all the required plugins, dependencies and package structure in place.

gradle createModule -Pmodule=network

Similarly, with features. This task will create :feature:settings:api and :feature:settings:impl from FreeMarker templates, and will also add them to settings.gradle.

gradle createFeature -Pmodule=settings --feature

Why our own solution? Because we didn't want our project to depend on a third-party Android Studio plugin. Otherwise we would have to explain to everyone how to install and configure it. Plus, in Geminio it's impossible to create multiple modules at once, and in general there is little control over what exactly is generated and where.
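The core of such a generator is small. The sketch below is not our actual task: it is a plain Kotlin function rather than a Gradle task, and it uses Kotlin string templates instead of FreeMarker. The plugin id `project.android.library`, the base package and the source folder layout are assumptions.

```kotlin
import java.io.File
import java.nio.file.Files

// Creates the module folder, a minimal build file and the package directory,
// then registers the module in settings.gradle.
fun createModule(rootDir: File, path: String, basePackage: String) {
    val relativePath = path.trim(':')                       // ":core:network" -> "core:network"
    val moduleDir = File(rootDir, relativePath.replace(':', '/'))
    val packageName = basePackage + "." + relativePath.replace(':', '.')
    File(moduleDir, "src/main/kotlin/" + packageName.replace('.', '/')).mkdirs()

    // "project.android.library" stands in for a project-specific convention plugin id.
    File(moduleDir, "build.gradle.kts").writeText(
        """
        plugins {
            id("project.android.library")
        }

        android {
            namespace = "$packageName"
        }
        """.trimIndent() + "\n"
    )

    // The leading newline keeps the include on its own line even if the file
    // does not end with a line break.
    File(rootDir, "settings.gradle").appendText("\ninclude '$path'\n")
}

// A feature is simply a pair of modules generated together.
fun createFeature(rootDir: File, name: String, basePackage: String) {
    createModule(rootDir, ":feature:$name:api", basePackage)
    createModule(rootDir, ":feature:$name:impl", basePackage)
}

fun main() {
    val root = Files.createTempDirectory("project").toFile()
    createModule(root, ":core:network", "com.example")
    createFeature(root, "settings", "com.example")
    println(File(root, "settings.gradle").readText())
}
```

Wrapping these functions in a Gradle task then mostly comes down to reading the module name from a project property.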

Clean Up the App Module

If your app module is a monolith, then move all its content to some other library module. Let it be the monolith we are fighting with. The app should be very thin. Its purpose is only to assemble dependencies and produce the final bundle.

The implementation of the Application class should be left in the app, as it is the place where we decide how to assemble our DI graph.

In an average project, Application is often a mess, tightly coupled with the rest of the code, so separating your monolith code from your Application class is also a job that will have to be done. If you are using your Application subclass in the code directly, stop doing it. Use the generic Application from the Context. If you really need to refer to your own class, you won't be able to: it stays in the app, while all the code that uses it moves into lower-level modules. The only thing you can do in this situation is to extract an interface somewhere lower in the hierarchy and cast the Application from the Context to this interface.
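A minimal sketch of that last trick follows. To keep it runnable outside Android, `Context` and `Application` here are tiny stand-ins for the platform classes, and `AppInfoProvider`, `MyApplication` and the module names in the comments are hypothetical.

```kotlin
// Stand-ins for android.content.Context and android.app.Application,
// only so that the sketch runs outside Android.
open class Context {
    open val applicationContext: Context
        get() = this
}

open class Application : Context()

// Lives in a low-level module (for example, a hypothetical :core:app-info).
interface AppInfoProvider {
    val versionName: String
}

// Lives in :app, the only module that knows the concrete class.
class MyApplication : Application(), AppInfoProvider {
    override val versionName: String = "1.0.0"
}

// Lives in the low-level module: casts to the interface, never to MyApplication.
fun Context.appInfo(): AppInfoProvider = applicationContext as AppInfoProvider

fun main() {
    val context: Context = MyApplication()
    println(context.appInfo().versionName)
}
```

The cast fails at runtime if the Application does not implement the interface, so it is worth keeping such interfaces few and small.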

A side effect of the app no longer being the largest module is that you get the ability to create thinner application modules for specific parts of the application, specific features, or for customizing applications without using flavors.

If you have flavors running through your entire project, across multiple modules, consider abstracting it all away and moving the resolution logic as high up the module graph as possible. Flavors have a very negative effect on configuration time, not to mention the fact that flavors are a rather awkward tool, and the code with flavors is quite inconvenient to maintain.

Prepare DI

First, a little note on the Dependency Inversion Principle (the D in SOLID). This is perhaps the most important principle when working with modularization. You will often find that some of your classes depend on each other, and it seems impossible to extract them into separate modules individually. Actually, you can. Remember what DIP says? Depend on abstractions, not on implementations. We create an interface, depend on the interface everywhere, and at the app level (most often) we link the interface with the implementation.

Frameworks. If you have Dagger and everything is on subcomponents, then I have bad news for you. If you have Hilt, then you have Dagger on subcomponents, only implicitly. Hilt has a very hard time with multiple modules because of this, and Dagger at least has options - you can refactor your Subcomponents to Components. With pain and suffering, but you can.

Now we have Koin, and we don't worry too much about the architecture of DI modules. In essence, we follow the same rule - the feature's DI module is defined in the feature module, and in the app we attach it to the main graph.

Why don't we worry too much? Because we are looking towards Decompose, which covers both our DI and navigation requirements in pure Compose without Fragments. But that's a completely different story.

An interesting side effect of modularization is that in each separate feature you can use any approach to DI you want. Even manual DI through constructor parameters becomes quite simple and convenient.
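Here is a minimal sketch of that idea with manual wiring through a constructor. All class names and the module names in the comments are hypothetical.

```kotlin
// :feature:profile:api - only the abstraction is public.
interface ProfileRepository {
    fun userName(userId: Int): String
}

// :feature:profile:impl - the implementation, invisible to other features.
class InMemoryProfileRepository : ProfileRepository {
    private val names = mapOf(1 to "Alice", 2 to "Bob")

    override fun userName(userId: Int): String = names[userId] ?: "Unknown"
}

// :feature:orders:impl - depends on :feature:profile:api only.
class OrderSummary(private val profiles: ProfileRepository) {
    fun title(userId: Int): String = "Orders of ${profiles.userName(userId)}"
}

// :app - the only place that sees both the interface and the implementation.
fun main() {
    val summary = OrderSummary(InMemoryProfileRepository())
    println(summary.title(1)) // Orders of Alice
}
```

Neither impl module references the other, so they can build in parallel once the api module is ready.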

Prepare Navigation

You need to abstract as much as possible from the navigation library. We agreed that the Api of the feature should be simple and not dependent on platform stuff. This means no Intent, Fragment, Context, and so on. It also means that the implementation of the feature itself (Impl) should be able to open itself using external dependencies. The Router from Cicerone could be such an abstraction.

We are now moving from Activity/Fragment navigation using Cicerone to Compose navigation using Decompose, but the principles do not change.

For example, in Decompose, the component interface should be in :feature:api and its implementation in :feature:impl, and both are absolutely clean and do not depend on platform types.
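The split between a platform-free Api and an Impl that opens itself through an injected abstraction can be sketched as follows. The `Router` interface here is a hypothetical stand-in, not Cicerone's actual Router API, and the other names are made up for illustration.

```kotlin
// Hypothetical navigation abstraction living in a core module.
interface Router {
    fun navigateTo(screenKey: String)
}

// :feature:settings:api - no Intent, Fragment or Context in the signature.
interface SettingsFeatureApi {
    fun openSettings()
}

// :feature:settings:impl - opens itself through the injected Router.
class SettingsFeatureImpl(private val router: Router) : SettingsFeatureApi {
    override fun openSettings() = router.navigateTo("settings")
}

// A Router that just records calls, useful for tests and for this demo.
class RecordingRouter : Router {
    val history = mutableListOf<String>()

    override fun navigateTo(screenKey: String) {
        history += screenKey
    }
}

fun main() {
    val router = RecordingRouter()
    val settings: SettingsFeatureApi = SettingsFeatureImpl(router)
    settings.openSettings()
    println(router.history) // [settings]
}
```

Callers in other features see only `SettingsFeatureApi`, so swapping the navigation library touches the Router implementation and the Impl modules, but not the callers.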

Keep in Touch with Developers

Most likely, if you are thinking about modularization, you are not working alone. I have already listed quite a few things that can change the usual way of life of developers on your project. Have syncs, keep telling people and giving presentations about what is happening on the project, and have a clear destination you are all moving towards.

Write documentation, so developers can refer to it for additional information on how to adapt to the new ideas. And not only them: future you will be grateful too. Documentation written in advance can be a good guide in this adventure.

A Word About Dynamic Features

If you have dynamic features, think carefully whether you really need them, because their implementation assumes that they should stand above the app in the module graph, which is quite absurd from an architectural point of view and prevents us from making the app thin.

I do not want you to think that I hate dynamic features, so I'll note that you can live with them.

How to get rid of dynamic features is a topic for a separate article or talk. You need to shift your thinking a bit to turn a dynamic module into a normal library module.

Extract Core Aggressively

Extract reusable code into core modules every time you can. Remember, it's much easier to extract small independent libraries than large features with a bunch of dependencies. Moreover, you won't be able to extract features from the monolith without first extracting all the core libraries that those features depend on. You probably have a ton of such dependencies: analytics, base fragments/activities, UI components, network code. The more is extracted into modules lower in the hierarchy, the easier it will be for you to extract features.

Write new core code right away in new modules - it's simple.

Extract Features Cautiously

Writing new features in new modules is quite difficult, because features are often built on legacy spaghetti that is hard to extract from the monolith all at once. Therefore, don't demand this from developers. Here you need a plan for extracting the legacy spaghetti first, because without it, writing features in separate modules will be difficult. This is the most painful part of refactoring.

Act Iteratively

There is no need to undertake the extraction of everything at once. The temptation to refactor "also this and this" is quite strong. Stop at the pre-planned milestones. But remember that the process is complex, and you learn a lot along the way. At the beginning of any large-scale refactoring, you are much more optimistic than in the middle, when you have already encountered problems. Gradually, you will solve these problems and move forward with a much more thoughtful solution overall.

Algorithm

How to extract any code or resources into a separate module:

  1. Analyze what code depends on your code
  2. Analyze what code and resources your code depends on
  3. Create a new module for your code
  4. Add a dependency on the new module in the modules that use this code
  5. Move all the code and resources used in multiple modules to a new module even lower in the hierarchy by the same algorithm
  6. Move the code to the created module
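Because step 5 applies the same algorithm to dependencies, the overall extraction order is a depth-first walk: dependencies first, the target last. The sketch below shows that order under a simplifying assumption that every dependency becomes its own extraction unit. The unit names are made up, and the `dependsOn` map is something you would collect during steps 1 and 2.

```kotlin
// Returns the order in which to extract `target` and everything it depends on.
// `dependsOn` maps a piece of code to the pieces it uses inside the monolith.
fun extractionOrder(target: String, dependsOn: Map<String, List<String>>): List<String> {
    val order = LinkedHashSet<String>()
    val inProgress = mutableSetOf<String>()

    fun visit(unit: String) {
        if (unit in order) return
        // A cycle cannot be extracted as is: break it with an interface first.
        check(inProgress.add(unit)) { "Dependency cycle through $unit" }
        dependsOn[unit].orEmpty().forEach { visit(it) }
        inProgress.remove(unit)
        order += unit
    }

    visit(target)
    return order.toList()
}

fun main() {
    val dependsOn = mapOf(
        "checkout" to listOf("analytics", "network"),
        "analytics" to listOf("network"),
    )
    println(extractionOrder("checkout", dependsOn)) // [network, analytics, checkout]
}
```

If the walk reports a cycle, the pieces involved depend on each other and have to be decoupled before either of them can move.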

Progress

Thus, your progress over time will look like this:

  1. First, you have one monolith
  2. Then you separate the monolith from the thin app module
  3. Work out what is feature code and what is core code inside the monolith
  4. Extract the core code that the features depend on into separate modules
  5. Extract the features into separate modules
  6. Repeat from step 3

Conclusions

Modularization is difficult and not everyone needs it. But if you have started, it's important to do it right, otherwise it will be even more painful than before. I hope I managed to convey the main ideas and approaches. This post turned out to be a compilation of many materials and personal experience from large, maturing projects, with a minimum of technical details, as I intended, but with something to think about.

I will be happy to answer your comments and messages. Thank you for your attention.
