Code Review Essentials

Code Reviews are an essential part of Software Engineering, providing numerous benefits for teams and the products they deliver. Having conducted them for many years now, in this article I will touch upon some key aspects which, generally speaking, are of particular importance.

Primary Benefits

Similar to functional testing, Code Reviews provide a unique set of quality controls which help ensure standards are upheld, affording teams the ability to verify a number of critical concerns early on and within the confines of engineering-specific constructs. This almost certainly yields a higher return, as addressing issues at this stage takes little time and requires minimal involvement across teams and functions.

Code Reviews also serve to aid in the verification and upholding of best practices, standards, and conventions across teams and within organizations. These standards can cover a broad range of concerns such as consistency, facilitation of reuse, scalability, security, optimization, readability, simplification, and any other auxiliary criteria specific to a given organization.

Additionally, the Code Review helps to confirm that requirements have been fulfilled in the context of the underlying feature being reviewed, as it is not uncommon for developers to misinterpret requirements.

Likewise, developers are generally focused on solving various small problems in a very particular and limited scope. Because of this, it is inevitable that opportunities will be missed and oversights will be made. One of the primary responsibilities of the Reviewer is to provide a holistic and broad perspective which takes into account not only the soundness of the code being reviewed, but also how it measures up, complies, and integrates in the context of the larger system as a whole.

By having another set of eyes, so to speak, we arm ourselves with a very important second line of defense, as well as an agent for opportunity.

Proliferation of Knowledge

One of the most beneficial aspects of Code Reviews is the investment in overall knowledge throughout the team and, ultimately, the ROI it provides. As such, core to the Code Review is the proliferation of knowledge. This applies to the Reviewee and the Reviewer alike.

For the Reviewee, when areas of improvement, best practices, optimizations, abstractions and the like are outlined, an opportunity is presented to learn new (often improved) techniques which they may not have been aware of otherwise. This holds particularly true for more junior developers who simply have yet to acquire the experiential knowledge obtained by their more senior counterparts. By learning from the experiences of others, the Reviewee can expedite their own growth as a Developer. Here, the expectation is that, over time, each Reviewee will have fewer and fewer of the same review comments to address, as they now have a dedicated platform (even if unofficially so) from which to continually learn.

For the Reviewer, Code Reviews provide an opportunity to share knowledge and insight, while affording one the ability to obtain a broader understanding of the system in its entirety, as this knowledge is vital to providing a successful review.

Additionally, it may be necessary for a Reviewer to devise and provide solutions to problems which they may not have encountered previously and, in order to be effective, a Reviewer must be confident in the feedback and solutions they are providing. This alone affords Reviewers the ability to gain a deeper understanding of their own knowledge, while also challenging them to obtain the information necessary to do so. Thus, for the Reviewer, Code Reviews present a tremendous opportunity to not only provide value to others, but also to obtain and enhance their own value as well.

Team Cohesion

In general, developers tend to work in a rather siloed manner, primarily focusing on one particular problem space (particularly in the scope of a given feature), and only collaborating when necessitated by DSMs, meetings, or when they or another team member runs into a problem and needs assistance. While much of this is a rather natural by-product of feature development, so too can it be said that Code Reviews naturally cultivate collaboration; thus, collaboration can be built into our processes by default.

With Code Reviews, no one Developer is ever working completely on their own. This has numerous benefits, many of which have already been outlined above, yet perhaps one of the most significant benefits is that developers are much more likely to double check their work and submit something that they can be proud of when they know someone else on their team will be reviewing their work. Likewise, Reviewers, no matter how experienced, are much more likely to validate and double check their feedback for the exact same reasons. This alone lends itself to higher quality output across the board.

Key Aspects to Consider

While numerous aspects must be considered with respect to conducting Code Reviews, generally speaking, there are common considerations which by and large tend to hold true. While certainly not an exhaustive treatise, what follows is a brief outline of those I have found to provide particular value.

Atomicity: PRs should be atomic (relatively small in nature). If a PR is excessively large, it should be rejected and the engineer should be asked to break it into smaller submissions (generally these smaller submissions can be merged to an intermediary branch before being merged to the intended target branch). This is crucial, as the surface area for mistakes and missed opportunities grows with the amount of code being reviewed. In addition, requiring PRs which are smaller in scope encourages developers to think in terms of smaller units of function and subsystems, which in turn leads to clearer separation of concerns and encapsulation. As such, it is often helpful to impose a change threshold for submitted PRs.
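
As a rough illustration of a change threshold, the following TypeScript sketch sums the lines added and removed in a PR and flags it when the total exceeds a limit. The `FileChange` shape, the function names, and the 400-line default are all hypothetical; each team would choose its own limit and wire the check into its own tooling.

```typescript
// Hypothetical shape for a per-file diff summary; real tooling will differ.
interface FileChange {
  path: string;
  additions: number;
  deletions: number;
}

// Assumed default for illustration only, not a recommended value.
const DEFAULT_MAX_CHANGED_LINES = 400;

// Total number of lines touched (added plus removed) across all files.
function totalChangedLines(changes: FileChange[]): number {
  return changes.reduce((sum, change) => sum + change.additions + change.deletions, 0);
}

// Returns true when the PR is larger than the agreed threshold.
function exceedsChangeThreshold(
  changes: FileChange[],
  maxChangedLines: number = DEFAULT_MAX_CHANGED_LINES
): boolean {
  return totalChangedLines(changes) > maxChangedLines;
}

const examplePr: FileChange[] = [
  { path: "src/cart.ts", additions: 120, deletions: 30 },
  { path: "src/cart.test.ts", additions: 200, deletions: 0 },
  { path: "README.md", additions: 60, deletions: 10 },
];

console.log(exceedsChangeThreshold(examplePr)); // true (420 changed lines > 400)
```

A check like this is only a prompt for a conversation; some large PRs (generated files, mechanical renames) may be perfectly reasonable.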

Compatibility: Changes should remain backwards compatible and not introduce breaking changes (unless expressly coordinated across teams). Reviewers need not check out each PR and explicitly test each feature being submitted; rather, they should always be cautious of breaking changes, particularly in terms of APIs (e.g. argument positions changing, etc.).
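
To make the argument-position concern concrete, here is a small TypeScript sketch. The `formatPrice` function and its options are hypothetical; the point is only that new settings added as a trailing optional parameter leave existing call sites untouched, whereas inserting a parameter in the middle silently shifts them.

```typescript
// Original signature that existing callers already depend on (hypothetical):
//   function formatPrice(amount: number, currency: string): string
//
// Breaking: inserting a parameter in the middle shifts every positional
// argument, so formatPrice(5, "USD") would now pass "USD" as the locale.
//   function formatPrice(amount: number, locale: string, currency: string): string

// Backwards compatible: new settings go into a trailing optional object.
interface FormatPriceOptions {
  decimals?: number;
}

function formatPrice(
  amount: number,
  currency: string,
  options: FormatPriceOptions = {}
): string {
  const decimals = options.decimals ?? 2;
  return `${amount.toFixed(decimals)} ${currency}`;
}

// Existing call sites keep working unchanged.
const legacyCall = formatPrice(5, "USD"); // "5.00 USD"

// New callers opt in to the added behavior.
const newCall = formatPrice(5, "USD", { decimals: 0 }); // "5 USD"
```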

Consistency: PRs must fully adhere to well documented and established standards and conventions, typically supported via commit convention tooling. This is crucial, as consistency and conformity of standards lead to a unified codebase where developers can easily work across packages and features with very limited effort. Because the overall structure and coding style is consistent, it is much easier to know where everything is (and should be) defined and how modules are organized, and readability is immediate, as formatting and structure remain the same across packages and modules.

Clarity: All modules, functions, classes, types, etc. should always be clearly named and defined, remain properly encapsulated, and reside within a logical and appropriate location.

Readability: Readability should be favored over excessive succinctness or overly “clever” implementations which do not read well. Conversely, overly verbose implementations are to be avoided as well. It is important to remain cognizant of the fact that code is read many, many more times than it is written. Moreover, when implementations become hard to reason about, that is often a sign of a poor implementation (usually the result of a specific unit doing far too much). Succinct, yet meaningful names must always be used. Strive to ensure code is self-documenting in terms of its intention.
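
As a small illustration of the trade-off, the TypeScript sketch below contrasts a terse implementation with a more readable equivalent. Both functions are made up for this example and behave identically for integer inputs; only the second states its intent through its names.

```typescript
// "Clever": terse, but the intent has to be reverse engineered.
const cnt = (a: number[]) => a.filter((x) => !(x & 1)).length;

// Readable: the names state the intent.
function isEven(value: number): boolean {
  return value % 2 === 0;
}

function countEvenNumbers(values: number[]): number {
  return values.filter((value) => isEven(value)).length;
}

const sampleValues = [1, 2, 3, 4, 6];
console.log(cnt(sampleValues), countEvenNumbers(sampleValues)); // 3 3
```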

Reusability: Implementations must take reuse into account at all times, be it abstractions to common packages, abstractions within a particular project, or abstractions within a particular scope of a project. In addition, Reviewers should always be on the lookout for additions which are redundant and should be removed and replaced with existing APIs. This includes both internal APIs and third-party libraries. Always ensure native APIs are being leveraged (Array.prototype.forEach, etc. rather than explicit for loops) as well as standard third-party libraries (lodash.debounce, etc. rather than custom implementations). No redundancies should be introduced, and implementations should fully utilize existing APIs, Modules, Components, etc. throughout the available packages.
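
The sketch below shows the kind of substitution a Reviewer might suggest, using the native `filter` and `reduce` array methods (from the same family as `forEach`) in place of a hand-written loop. The `Order` type and function names are assumptions made for the example.

```typescript
// Hypothetical domain type for the example.
interface Order {
  id: number;
  total: number;
  shipped: boolean;
}

// Explicit loop: a mutable accumulator and manual branching.
function shippedTotalWithLoop(orders: Order[]): number {
  let sum = 0;
  for (const order of orders) {
    if (order.shipped) {
      sum += order.total;
    }
  }
  return sum;
}

// Native array methods: same result, with each step named by its method.
function shippedTotal(orders: Order[]): number {
  return orders
    .filter((order) => order.shipped)
    .reduce((sum, order) => sum + order.total, 0);
}

const sampleOrders: Order[] = [
  { id: 1, total: 10, shipped: true },
  { id: 2, total: 20, shipped: false },
  { id: 3, total: 5, shipped: true },
];

console.log(shippedTotal(sampleOrders)); // 15
```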

Simplicity: Solutions should always be implemented in the simplest way possible. Less is more, and this extends down to each line of code. Keep things as simple as possible, but no simpler.

Scalability: Implementations must be performant and optimized to an acceptable and expected level. Generalized optimizations must be made, while more specialized optimizations should only be suggested when necessary, so as to avoid optimizing prematurely.

Securability: Implementations must be secure, keeping standardized security measures in place and ensuring attack vectors and the overall attack surface are fully understood, accounted for, and securely addressed.

Discoverability: Documentation and / or related tools should follow specific conventions and remain succinct and to the point. Ideal documentation should provide a meaningful, yet brief description, followed by a useful example which speaks for itself (often, unit test expectations can be used verbatim here). On a related note, sources should not contain overly verbose inline comments either. When, for example, a function has more lines of inline comments than actual implementation code, that’s usually a sign that the code does not read well, or the developer has merely been leaving “note to self” comments. In such cases, strive to provide ways to simplify the implementation such that it achieves better readability by being self-documenting.
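
A minimal sketch of this documentation style, assuming a JSDoc-style convention (the `slugify` function is hypothetical): one brief sentence of description, followed by an example that mirrors what a unit test would assert.

```typescript
/**
 * Converts a title into a URL-friendly slug.
 *
 * @example
 * slugify("Code Review Essentials"); // "code-review-essentials"
 */
function slugify(title: string): string {
  return title
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into one dash
    .replace(/^-+|-+$/g, ""); // drop leading and trailing dashes
}
```

The example in the comment doubles as a test expectation, so the documentation and the tests are less likely to drift apart.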

Accountability: It is crucial that all Team members are aware of the criteria against which their code will be reviewed, as doing so essentially holds developers accountable for ensuring they not only understand what is expected, but are diligent in reviewing their own work prior to submission. Developers should be encouraged to pre-submit PRs in order to perform a “self review” prior to officially submitting and / or assigning a reviewer. This approach is quite valuable, as it provides the developer with a high-level overview of their changes outside of the environment they have been working in, and within the context of the branch to which their changes will be integrated.

Concluding Thoughts

While there are certainly other factors to consider when conducting Code Reviews, the above considerations touch upon some of the more fundamental aspects. Hopefully the key points are apparent; perhaps the most important trait of a successful reviewer is the ability to clearly express intent while also passing this knowledge on to others.

Leveraging GPT to Revolutionize Workflows and Processes

In the history of technological breakthroughs, Generative Pre-trained Transformers (GPT) stand out as a monumental leap in Artificial Intelligence, with the potential to fundamentally transform the way we, as Developers, work.

This highly advanced and sophisticated AI Language Model offers a plethora of ground-breaking software engineering applications, ranging from code generation to automating complex, repetitive tasks. This article explores the concept of GPT, its various applications, limitations, and tips for optimal utilization in the context of Software Engineering.

What is GPT?

GPT, or Generative Pre-trained Transformer, is a Machine Learning model which utilizes Deep Learning techniques to produce human-like natural language text. It can be applied to a wide range of tasks, such as answering intricate questions within context, summarizing text, code generation, language translation, as well as numerous other applications.

GPT-3.5: At the time of writing, the current version of GPT, GPT-3.5, is based on a dataset of billions of webpages, books, and text-based information (up until 2021) and, like GPT-3 before it, is reported to contain 175 billion parameters.

GPT-4: The next release of GPT, GPT-4, is anticipated to feature a vast dataset of trillions of webpages, books, and other textual sources, and has been rumored to contain over 100 trillion parameters, although OpenAI has not confirmed any figure.

How can GPT be used today?

There are numerous Tools on the market that are built on GPT Technology, and, from a Developer perspective, the following outlines those which are most likely to provide the best entry point for enhancing Developer Experience (DX).

ChatGPT: The most common entry into GPT, ChatGPT is a language model that is trained on a massive amount of textual data. This allows it to generate human-like text and respond to a wide range of prompts with impressively high accuracy. Conceptually, ChatGPT can be thought of as a successor to traditional search in that it essentially cuts out the entire process of searching, identifying relevant results, following links to those results, sifting through content, and trying to arrive at an answer. ChatGPT eliminates this by providing answers or relevant information directly in response to questions in a natural and intuitive manner.

GPT API: The GPT API allows developers to access GPT’s capabilities via a REST API. The API can be used to generate text, translate text, and answer questions. API access is pay-per-use, with pricing dependent on the number of requests issued and the amount of text generated. A free tier for developers to test the API is also available, as well as custom pricing for enterprise customers with high volume usage.
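
To give a sense of what calling the API involves, the TypeScript sketch below builds (but does not send) a text completion request. The endpoint and field names follow OpenAI's completions endpoint as commonly documented around the time of writing; the model name is a placeholder that may no longer be available, and the helper names and parameter values are assumptions. Check the current API reference before relying on any of it.

```typescript
// Request body for a text completion. Field names are taken from OpenAI's
// completions endpoint; verify them against the current API reference.
interface CompletionRequestBody {
  model: string;
  prompt: string;
  max_tokens: number;
  temperature: number;
}

// Hypothetical, client-agnostic description of an HTTP request.
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request without sending it, so it can be inspected or logged.
function buildCompletionRequest(apiKey: string, promptText: string): HttpRequest {
  const body: CompletionRequestBody = {
    model: "text-davinci-003", // placeholder model name; may be unavailable
    prompt: promptText,
    max_tokens: 256, // assumed cap on the length of the generated text
    temperature: 0, // assumed setting favoring repeatable output
  };
  return {
    url: "https://api.openai.com/v1/completions",
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(body),
  };
}

const exampleRequest = buildCompletionRequest(
  "YOUR_API_KEY",
  "Summarize the following release notes in two sentences: ..."
);
```

The resulting object can be handed to whichever HTTP client a project already uses. The API key should come from a secret store or environment variable rather than source code.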

GPT Playground: Similar to ChatGPT, yet fully configurable and more stable, the OpenAI GPT Playground allows users to experiment with the full set of GPT’s capabilities, including Model selection, introspection, and much more.

Additional Tools built on GPT: There are far too many tools built on GPT to list within the scope of this short article; however, a few notable mentions are the ChatGPT – Genie AI VSCode Plugin, as well as the OpenAI NPM Package.

How can GPT Enhance Developer Experience?

While there are numerous applications for which GPT Technology can be utilized to provide an enhanced Developer Experience (DX), below is a brief summary of a few of the most common.

Unit Test Generation: GPT can be used to generate test cases and setup, allowing developers to expedite the process of test setup, configuration, and initial test cases.

Debugging: GPT can be utilized to help debug issues in source code, identify misconfigurations, and more.

Code Generation: GPT can be utilized to generate source code, examples for specific languages and frameworks, convert source code from one language to another, and much more.

Streamlining Workflows: GPT can be integrated into development tools, such as IDEs and issue tracking systems, to automate repetitive tasks and streamline workflows.

Technical Documentation: GPT can be utilized to generate technical documents, such as API docs, design specifications, and more, thus improving the quality and accuracy of the information available to developers and teams.

Automating Repetitive Tasks: GPT can be trained to handle repetitive tasks such as scheduling builds, deployments, responding to common queries and more, freeing up developers’ time for more important tasks.

Streamlining Communication: GPT can be integrated into communication tools such as Jira, Teams, etc., allowing Developers to quickly and easily communicate with team members, saving time and improving efficiency.

Identifying Patterns and Trends: GPT can be leveraged to analyze large amounts of data, such as engineering analytics, project management information, etc. to identify patterns and trends that may be difficult for humans to detect, helping Teams to make informed decisions.

Current Limitations

As with any relatively new Product, certain limitations and issues are to be expected as the platform matures. The most notable are as follows.

Error Prone: GPT is regularly prone to error, and in certain cases, once an error is encountered, the conversation cannot be continued, leaving one to have to start their prompts over again within a new chat.

Accuracy and Completeness: GPT’s accuracy and completeness are often quite limited, and so it is crucial that Developers be prudent in validating outputs. Moreover, as the Model’s dataset cutoff date was in 2021, not all prompt outputs are currently relevant.

User Experience: The ChatGPT UX is lacking in many areas and doesn’t quite do the underlying platform justice. The UI is often slow and a bit disjointed; however, when it is stable, it is certainly quite usable and helps to accomplish one’s goals. This is particularly true when using a ChatGPT Plus Account.

Tips and Considerations

As with any tool, it is crucial to have an understanding of its capabilities and best practices in order to get the most from the experience. A few notable items are as follows.

Utilize Prompt Engineering: Be specific and focus on one particular topic or aspect of a topic. Resist the urge to use polite expressions such as “please”, “thank you”, etc. Instead, focus on including the necessary input required to receive the desired output.

Provide Specific Context: The more specific the information you provide to the model, the better the output will be. This can be done by providing a clear and concise, yet very specific question, including the necessary context required for the task you want the model to perform. Likewise, be mindful of ethical considerations – do not interact with ChatGPT in an unethical manner.
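
One way to keep prompts focused and specific is to assemble them from explicit parts. The TypeScript sketch below is a hypothetical helper (the `PromptParts` shape and the labels it emits are assumptions, not a format required by any tool) that combines a single task, its supporting context, and the desired output format.

```typescript
// Hypothetical structure for the pieces of a focused prompt.
interface PromptParts {
  task: string; // the single thing being asked
  context: string[]; // facts the model needs (language, versions, constraints)
  outputFormat?: string; // how the answer should be shaped
}

// Joins the parts into one plain-text prompt, skipping empty sections.
function buildPrompt(parts: PromptParts): string {
  const lines: string[] = [`Task: ${parts.task}`];
  if (parts.context.length > 0) {
    lines.push("Context:");
    for (const item of parts.context) {
      lines.push(`- ${item}`);
    }
  }
  if (parts.outputFormat) {
    lines.push(`Output format: ${parts.outputFormat}`);
  }
  return lines.join("\n");
}

const examplePrompt = buildPrompt({
  task: "Write a unit test for a function that converts titles to URL slugs",
  context: ["Language: TypeScript", "Test runner: Jest"],
  outputFormat: "A single code block with no explanation",
});

console.log(examplePrompt);
```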

Be Mindful of Sensitive Information: Inputs provided to ChatGPT should always be assumed to be persisted and potentially made publicly available. Do not provide any sensitive or proprietary information, such as usernames, passwords, keys, domain specifics, or business specifics.
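
As a precaution, text can be passed through a redaction step before it is ever pasted into a prompt. The TypeScript sketch below is a minimal illustration only: the two patterns are hypothetical and far from complete, so a filter like this should not be treated as a guarantee that nothing sensitive gets through.

```typescript
// Hypothetical patterns; a real deployment would need a vetted, broader set.
const SENSITIVE_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: "EMAIL", pattern: /[\w.+-]+@[\w-]+(\.[\w-]+)+/g },
  { label: "SECRET", pattern: /\b(password|api[_-]?key|token)\s*[:=]\s*\S+/gi },
];

// Replaces every match of every pattern with a labeled placeholder.
function redactSensitive(text: string): string {
  return SENSITIVE_PATTERNS.reduce(
    (result, { label, pattern }) => result.replace(pattern, `[REDACTED ${label}]`),
    text
  );
}

const rawInput = "Contact jane.doe@example.com, password=hunter2 please";
console.log(redactSensitive(rawInput));
// Contact [REDACTED EMAIL], [REDACTED SECRET] please
```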

Validate and Verify Output: Always make sure to validate and verify received output. Never use output directly without first vetting it for accuracy, completeness, etc.
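
When output is consumed by code rather than read by a person, part of this vetting can be automated. The TypeScript sketch below assumes the model was asked to return a JSON array of review comments (the `ReviewComment` shape is hypothetical) and rejects anything that does not parse or does not match that shape. It checks structure only; whether the content is accurate still requires human judgment.

```typescript
// Hypothetical shape the model was asked to return.
interface ReviewComment {
  file: string;
  line: number;
  message: string;
}

// Type guard: checks an unknown value field by field instead of trusting it.
function isReviewComment(value: unknown): value is ReviewComment {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  const file = candidate["file"];
  const line = candidate["line"];
  const message = candidate["message"];
  return (
    typeof file === "string" &&
    typeof line === "number" &&
    Math.floor(line) === line &&
    line > 0 &&
    typeof message === "string"
  );
}

// Returns the parsed comments, or null when the output cannot be trusted.
function parseModelOutput(raw: string): ReviewComment[] | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // not valid JSON at all
  }
  if (!Array.isArray(parsed)) {
    return null;
  }
  const comments: ReviewComment[] = [];
  for (const item of parsed) {
    if (!isReviewComment(item)) {
      return null; // one malformed entry invalidates the whole response
    }
    comments.push(item);
  }
  return comments;
}
```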

Explore the OpenAI Playground: Once you are comfortable using ChatGPT, try the OpenAI Playground, as it provides low-level access to GPT, such as switching models, configuring token length, and numerous additional configurations.

Innovative Use-Cases

While it is inevitable that there will be countless applications for utilizing GPT technology in Software Development, the following outlines some exciting possibilities on the horizon.

Application Source Ingestion and Optimization: Utilizing GPT to ingest application source code provides significantly enhanced analysis. Such integrations can create a model of an application’s data and control flow and suggest opportunities for optimization, reactively identify issues, and generate comprehensive design documentation.

Automated Code Reviews: Integrating GPT as an NLP tool to perform automated code reviews based on organization and team best practices, industry best practices, and historical data from previous code reviews can streamline the process. This can be integrated directly within IDEs, significantly speeding up existing code review processes.

Application Integration: Integrating GPT within applications can streamline help documentation, how-to guides, and augment existing features, providing users with a more seamless experience.

Enhanced API Docs: Integration within platforms can improve adoption via enhanced API examples. For instance, consider a Swagger implementation in which a user simply states what they are trying to do and instantly receives a complete example, streamlining the development process.

Conclusion

GPT offers a transformative leap in Natural Language Processing, significantly impacting developers and engineering managers by streamlining workflows, automating repetitive tasks, and providing advanced capabilities in various aspects of software development. As the technology continues to evolve, it is essential for developers and engineering teams to stay informed about the latest developments, limitations, and best practices to make the most out of this powerful AI tool.

January 3, 2023 Agile / AI / GPT / News / Software Engineering