Updated Review of LLM Based Development

I tried developing using GPT mid-2022. While I was amazed by the potential, I was not impressed enough to add it to my daily development workflow. It fell off my radar as a development tool, outpaced by a far more impactful use of text generation and image creation. A toolset that has significantly changed my day-to-day productivity.

Recently, a group of peers convinced me to give coding with LLM another shot. They loved using A.I. to develop code in languages they were not comfortable with. Or, as a manager, a way to better explain what they wanted to see in a project from their team. What convinced me to try it again was their highlighting of how well the results were formatted, syntactically correct, and well documented from the get-go. While they admitted the development of code may not be faster, the prospect of all those benefits culminating into a cleaner, well formatted final product convinced me to develop with GPT again in earnest.

I began my reexamination of the tooling via demos, as we often do. I was very impressed. I converted code into PowerShell (which I don’t know well) and re-created functionality I came across in weeks prior. I was so impressed, I showed my team examples of how the work they completed in the past could’ve been done with the help of GPT instead.

After those successes, I committed to using GPT to develop. Over the next few weeks I made sure to use it on projects I was working on.

While the technology showed incredible advancements since I tried it last year, it still hasn’t become my go-to in the same way using ChatGPT has for writing content.

Here are some areas I was both impressed with but left wanting:

Code completion
- Pro: Impressive. The look-ahead came in handy similarly to code-completion functionality of the past, with the added benefit of more contextual relevance that was not just a “cookie cutter” snippet.
- Con: It gave me a useless hint quite a bit and I found myself typing almost as much as before with the incumbent “dumb completion”. I think it is because my mind is moving ahead to what I want the code to do, not necessarily what it is doing on the console at the moment. In the end, it is using patterns to make predictions. So, any new code that is a result of changes to my approach, or my on-the-fly reworking to fix a bug (that was not due to syntax issues) took as much time to develop as non-GPT-based code completion.
Testing
- Pro: When it comes to testing an existing application, the A.I. hits it out of the park. Ask it to “write a test for myFunction() using Jest” it creates an awesome base test case that I would have hated to write for each function.
- Con: Some of the same issues outlined in the “Code Completion” and Functional Development” can be problematic here. It doesn’t always create a great test for code I haven’t written yet. (i.e. TDD) However, if the code is already there, it uses that context I’ve provided and its LLM to unpack what it the function is suppose to do and generate all the mocks and assertions needed to create a well written unit test.
Functional Development
- Pro: Much like helping me get past the dreaded blank page in text generation, I found it more useful than Google searches and StackOverflow reviews to develop a series of functions I wanted, without developing entirely from scratch. Better than code snippets, the snippets A.I. gave were pre-filled based on my prompts, variables, and existing object definitions. That was appreciated. I didn’t have to review the documentation to tweeze out the result I wanted. The A.I. pulled it all together for me.
  Additionally, the fact that it almost always has an answer goes under appreciated in other reviews I’ve read. The part that makes it so advanced, is it fills in a lot of grey area even if I (as a stupid human) carelessly leave out an instruction that is critical in generating a solution. If I were to get the response, “could not understand your request” due to my laziness, I would never use it. The assumptions it makes are close enough to my intent that I am either using the solution, learn a new path, or see what my explanation is missing so I can improve how I communicate with it.
- Con: The end result did not work out of the gate most of the time. Sometimes it never got it correct and I had to Google the documentation to figure the issue. This was due to what I think was more than one documentation existing for various versions of the library I was using. I’m not sure. While the syntax was correct, the parameters it assumed I needed, or the way the calls were made to interface with a library/API led to errors.
Debugging
- Pro: Per the “functional development” points above, I was impressed at how I could respond to a prompt result with “I got this error when using the code above: [error]”. It recognized where it went wrong, and attempted to rewrite the code based on that feedback.
- Con: Each response had a different result than the original. So, instead of fixing what I found was wrong (like a param missing) it also added or removed other features from the code that were correct. This made the generated result difficult to utilize. In some cases, it could never understand the issue well enough to generate working code.

One limitation I am not too surprised about, and am hopeful to see evolve in the future, is the AI’s understanding of a project in its entirety. Done so in a way that context is used in its function creation, making the solutions it provides “full stack”. Imagine a Serverless.com config, for an AWS deployment, that generates files and code that creates and deploys workflows using Lambda, DynamoDB, S3 and so on, all being developed based on prompts. With the yearly (and more recently) weekly leaps, I don’t think we are to far away.

As of today, I find myself going to GPT when filling in starter templates for a new project. I find it’s a much better starting point than starting from cookie cutter function as I set up my core, early, “re-inventing the wheel”-type, skeleton.

For example, I will use a Gitlab template for my infrastructure (be it GL Pages, Serverless, React, nodejs or Python and on and on), then fill in the starter code an tests via a few GPT prompts, and copying them over. Beyond that copy, I find myself detaching from GPT for the most part, and returning to occasionally “rubber duck” new framework functions.

Examples referenced above

Share:

Related

Leave a comment Cancel reply