Continuous integration systems play a crucial role in keeping software working while it is being developed. The basic steps most continuous integration systems follow are:

1. Get the latest copy of the code.
2. Run all tests.
3. Report results.
4. Repeat 1-3.
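
In code, this basic loop might look like the following naive sketch (the helper functions are hypothetical placeholders for a version-control sync, a test runner, and a reporter):

```python
def naive_continuous_integration(sync_to_head, run_all_tests, report_results):
    """Sketch of a traditional CI loop: sync, test everything, report, repeat."""
    while True:
        sync_to_head()              # 1. Get the latest copy of the code.
        results = run_all_tests()   # 2. Run all tests.
        report_results(results)     # 3. Report results.
        # 4. The loop itself repeats steps 1-3 indefinitely.
```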

This works great while the codebase is small, code flux is reasonable, and tests are fast. As a codebase grows over time, the effectiveness of such a system decreases: as more code is added, each clean run takes much longer and more changes get crammed into a single run. If something breaks, finding and backing out the bad change is a tedious and error-prone task for development teams.

Software development at Google is big and fast. The codebase receives 20+ code changes per minute, and 50% of the files change every month! Each product is developed and released from ‘head’, relying on automated tests to verify product behavior. Release frequency varies from multiple times per day to once every few weeks, depending on the product team.

With such a huge, fast-moving codebase, it is possible for teams to get stuck spending a lot of time just keeping their build ‘green’. A continuous integration system should help by providing the exact change at which a test started failing, instead of a range of suspect changes or a lengthy binary search for the offending change. To find the exact change that broke a test, we could run every test at every change, but that would be very expensive.

To solve this problem, we built a continuous integration system that uses dependency analysis to determine all the tests a change transitively affects, and then runs only those tests for every change. The system is built on top of Google’s cloud computing infrastructure, enabling many builds to be executed concurrently and allowing the system to run affected tests as soon as a change is submitted.
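
As a rough sketch of that flow, assuming hypothetical helper names (the affected-test computation itself is walked through later in the post and passed in here as a function):

```python
from concurrent.futures import ThreadPoolExecutor

def on_change_submitted(change, graph, affected_tests, run_test,
                        pool: ThreadPoolExecutor):
    """Launch only the tests affected by one submitted change, concurrently,
    without waiting for any earlier build/test cycle to finish."""
    tests = set()
    for rule in change.changed_rules:    # build rules whose files were edited
        tests |= affected_tests(graph, rule)
    for test in tests:
        pool.submit(run_test, test, change)
```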

Here is an example where our system can provide faster and more precise feedback than a traditional continuous build. In this scenario, there are two tests and three changes that affect these tests. The gmail_server_tests are broken by the second change; however, a typical continuous integration system will only be able to tell that either change #2 or change #3 caused this test to fail. By using concurrent builds, we can launch tests without waiting for the current build/test cycle to finish. Dependency analysis limits the number of tests executed for each change, so that in this example the total number of test executions is the same as before.

[Image missing: timeline contrasting a traditional continuous build with per-change affected-test runs]

Let’s look deeper into how we perform the dependency analysis.

We maintain an in-memory graph of coarse-grained dependencies between the various tests and build rules across the entire codebase. This graph, several gigabytes in memory, is kept up to date with each change that gets checked in. This allows us to transitively determine all tests that depend on the code modified in a given change, and hence need to be re-run to establish the current state of the build. Let’s walk through an example.
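
Before diving in, here is a minimal sketch of how such a graph could be represented in code. The class name, method names, and the reverse-edge representation are illustrative assumptions, not a description of Google’s actual implementation:

```python
from collections import defaultdict

class BuildGraph:
    """Hypothetical in-memory graph of coarse-grained dependencies.

    Edges are stored in reverse (rule -> rules that directly depend on it),
    so everything affected by a change is found by a forward traversal.
    """

    def __init__(self):
        self.rdeps = defaultdict(set)  # rule -> set of direct dependents
        self.tests = set()             # rules that are test targets (leaves)

    def add_rule(self, name, deps, is_test=False):
        """Register a build rule and the rules it directly depends on."""
        for dep in deps:
            self.rdeps[dep].add(name)
        if is_test:
            self.tests.add(name)

    def update_rule(self, name, old_deps, new_deps):
        """Keep the graph up to date as a submitted change edits a rule."""
        for dep in set(old_deps) - set(new_deps):
            self.rdeps[dep].discard(name)
        for dep in set(new_deps) - set(old_deps):
            self.rdeps[dep].add(name)
```

Storing reverse edges is a natural choice here, since every query below walks from a changed rule toward the tests that depend on it.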

Consider two sample projects, each containing a different set of tests:

[Image missing: test targets for the Gmail and Buzz projects]

where the build dependency graph looks like this:

[Image missing: build dependency graph for the two projects]
We will see how two isolated code changes, at different depths of the dependency tree, are analyzed to determine the affected tests, that is, the minimal set of tests that must be run to ensure that both the Gmail and Buzz projects are “green”.

Case 1: Change in a common library

For the first scenario, consider a change that modifies files in common_collections_util.

[Image missing: the changed common_collections_util node at the base of the graph]

As soon as this change is submitted, we start a breadth-first search to find all tests that depend on it.

[Image missing: BFS reaching the direct dependents of common_collections_util]

Once all the direct dependencies are found, we continue the BFS to collect all transitive dependencies until we reach the leaf nodes.

[Image missing: BFS expanded through all transitive dependents]

When done, we have all the tests that need to be run, and we can calculate the projects that will need to update their overall status based on the results of these tests.

[Image missing: the affected tests and resulting project statuses for Case 1]
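
To make this walkthrough concrete, here is a rough sketch of that search, reusing the hypothetical BuildGraph above. The wiring of the example graph is an assumption based on the rule names mentioned in this post and in the comments below (the original figures are unavailable):

```python
from collections import deque

def affected_tests(graph, changed_rule):
    """BFS from a changed rule through reverse dependencies, collecting
    every test target reachable from it."""
    affected, seen = set(), {changed_rule}
    queue = deque([changed_rule])
    while queue:
        rule = queue.popleft()
        if rule in graph.tests:
            affected.add(rule)          # a leaf test node that must be re-run
        for dependent in graph.rdeps[rule]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return affected

# Assumed wiring for the Gmail/Buzz example (common_collections_util_tests
# is confirmed by the authors in the comments below).
g = BuildGraph()
g.add_rule("youtube_client", deps=["common_collections_util"])
g.add_rule("gmail_server", deps=["common_collections_util"])
g.add_rule("gmail_server_tests", deps=["gmail_server"], is_test=True)
g.add_rule("buzz_client_tests", deps=["youtube_client"], is_test=True)
g.add_rule("common_collections_util_tests",
           deps=["common_collections_util"], is_test=True)

# Case 1: a change to the common library reaches every test above.
print(sorted(affected_tests(g, "common_collections_util")))
# -> ['buzz_client_tests', 'common_collections_util_tests', 'gmail_server_tests']
```

Mapping each affected test back to its owning project (for example, gmail_server_tests to Gmail) would then tell us which projects’ overall statuses need recomputing.
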
Case 2: Change in a dependent project

For the second scenario, consider a change that modifies files in youtube_client.

[Image missing: the changed youtube_client node in the graph]

We perform the same analysis and conclude that only buzz_client_tests is affected, and that the status of the Buzz project needs to be updated:

[Image missing: the affected test and project status for Case 2]
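
Under the same assumed wiring as the sketch above, the Case 2 query returns a single test:

```python
# Case 2: only tests downstream of youtube_client are affected.
print(sorted(affected_tests(g, "youtube_client")))  # -> ['buzz_client_tests']
```
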
The examples above illustrate how we optimize the number of tests run per change without sacrificing the accuracy of end results for a project. Running fewer tests per change allows us to run all affected tests for every change that gets checked in, making it easier for a developer to detect and deal with an offending change.

The use of smart tools and cloud computing infrastructure makes the continuous integration system fast and reliable. While we are constantly working to improve this system, thousands of Google projects are already using it to launch and iterate quickly, making faster user-visible progress.

- Pooja Gupta, Mark Ivey and John Penix
Comments

  1. Extremely interesting article. But there isn't a mention of a specific programming language, or is that not a limitation of the "system's dependency analysis"? If this 'dependency analysis' works on a C++ code-base, I for one would love to get more details. Or do I need to join Google for that :-)

  2. As we perform dependency analysis at the build rule level, it is language-agnostic as long as the code base follows the convention of defining build rules. Send us your resume anyway? :-) Find more details at: http://www.google.com/intl/en/jobs/index.html

  3. That seems to mean that a hidden dependency - buzz depends on youtube_server for example but it is not listed in the manual dependencies - would cause you to misdiagnose the offending changelist.

    Also, this system does not count sporadic failures - what happens if a test is flaky and fails sometimes?

  4. Thanks @poojagupta. Are you allowed to give us more information on what the build-rule looks like?

    Is the dependency-analysis specific to the build tool Google uses (is that even publicly known?) or is it something that can be practically applied to any build-tool (e.g. CMake)?

    From the outside it seems like this tool, that you describe, can be a product in its own right. Any chance Google would sell it?

    Thanks in advance.

  5. @JohnM The change that made buzz depend on youtube_server, but didn't add the proper dependency to the build rule would cause a build failure for buzz. You are correct that any change to youtube_server while the buzz dependencies were broken would be missed - but this isn't a state that lasts long. I've never heard of this happening.

  6. @Ganeshram, The build-rule is specific to the build tool used at Google, which is not publicly known. Currently this system is tightly coupled with the rest of developer infrastructure at Google and not available as a product.

    I'm not familiar with CMake, but looking at the example in http://www.cmake.org/cmake/help/examples.html, I can imagine a dependency tree built on top of CMake's target_link_libraries, which seems to define the dependencies.

  7. So did this get built in from the beginning or was it created later after the code base started getting large and awkward? I'd like to get a sense of the challenges and time involved in creating this against an existing code base if that's something you have experience with.
    Also does this help reduce intermittent test failures as tests are much less likely to have hidden dependencies?

  8. It's an interesting idea, applying the dependency management to the test runs and not only the builds.

    But I have a question: in the first example you can track down that both change #2 and change #3 are breaking gmail_server_tests independently.

    But how do you go back in the version control system and fix those changes individually? Do you go back to pre-change #2 and then serialize the (fixed) changes?

    Another question is that now you can know earlier if it was change #2 or #3 that broke the tests, but you will not know (till later) if the combination of change #2 and #3 (also) breaks the tests.

    Is this a tradeoff you agree upon, since maybe individual changes breaking the build are more common than combinations of changes?

    Thanks for the interesting post!

  9. I'm not sure I follow your first question. When we know both change #2 and #3 are breaking the test independently, a developer can make either a single change to fix both problems, or two separate changes. If, for example, the issue caused by change #2 is severe, they can roll back change #2 and submit a change fixing the problem caused by change #3.

    For the second question, change #2 and change #3 are likely to be submitted by different developers breaking the same test in different ways, hence they would probably fix their breakages separately. Also, as you mentioned, individual changes breaking the build are more common than combinations of changes.

  10. Just to answer Lukas Blakk's question:

    The system was built after the code base was already quite large (and still growing fast) and the cost of traditional continuous builds was getting to be a burden. For the server tracking dependency information, there were already files in place containing dependency information, so we just had to make something to read them all in and build up the graph. The work for creating the graph with sufficient information at the nodes isn't super hard if you have all of the edge information handy, but dealing with the updates, checkpointing for error recovery, ... can definitely take more time.

    As for flaky tests, those are a problem in and of themselves. Running them only when they could be affected reduces the noise, but that's about it. It does make it easier to preemptively see what tests you should run to verify a change going in, so it can be used to reduce surprising breakages.

  11. I've been thinking about a similar thing recently - these dependencies (even better to track also dependencies between tests) could be used to build something like "test case priorities"...

    Example: there is a test case A, which depends on test case B (A tests feature AA which depends on feature BB, tested by B)...

    Then if BB is broken, both B and A tests fail, but A doesn't test BB directly.

    The idea is to display the results with sort of priority: B has higher priority than A, as fixing BB will fix both tests...

    The main idea is to help find the real problem faster. Of course this is not related to pure unit tests which should not rely on other components.

  12. Thanks for this post - great read!

    In Case 1:
    Aren't we missing common_collections_util_tests? Wouldn't it be quicker to execute this first before executing any test from upstream dependencies?

  13. @Mathias

    Good catch. Yes, there should also be common_collections_util_tests, which will also get run for case 1.

    With the current architecture, we do not prioritize some tests over others for a single change. There is definitely scope for improvement here, e.g. by proximity as you suggested, or say run all small+stable+fast tests before running large+flaky tests.

    Good suggestion.

  14. How do you know which test tests which feature?

  15. This system does not know anything about features in a product. That is up to the respective teams.

  17. Interesting!! I am trying to do something similar to save on the time it takes to test the whole system. I want to know how you maintain the dependency matrix. To elaborate: if some new component is added into my code base, do I manually go and change the matrix and then reload the matrix in memory, or are you using some tool which does that for you?
    Thanks
    Harneet - struggling - Build and release Engg.

  18. @Rebellious_jatt As changes are checked in, the graph is automatically updated so that the correct tests are executed.

  29. Great article but something happened to the images. Can you please restore them? Thanks!
