Categories
Artificial Intelligence Automation Autonomous Agents Business Deep Learning GPT4o Information Technology Reinforcement Learning Reinforcement Learning Strategy Technology Technology Strategy

Navigating the Future with Generative AI: Part 4, Unstoppable AGI and Superintelligence?


1. Connecting the Dots Between Two Life-Changing Milestones for Humanity

In a TIME magazine interview, Yann LeCun remarked, “I don’t like to call [it] AGI because human intelligence is not general at all.” This viewpoint challenges our common understanding of Artificial General Intelligence (AGI) versus the supposed limitations of human intelligence. The term “artificial general intelligence” itself seems overused and often misunderstood. While it initially appears intuitive, upon closer examination, nearly everyone with an informed perspective offers a different definition of AGI.

The fog only thickens with Ilya Sutskever, co-founder and former Chief Scientist of OpenAI, the company behind the wildly popular GPT generative AI models. In an MIT Technology Review interview, he states, “They’ll see things more deeply. They’ll see things we don’t see,” followed by, “We’ve seen an example of a very narrow superintelligence in AlphaGo. […] It figured out how to play Go in ways that are different from what humanity collectively had developed over thousands of years. […] It came up with new ideas.”

Before DeepMind’s AlphaGo versus Lee Sedol showdown in 2016, we had IBM’s Deep Blue chess victory against Garry Kasparov in 1997. The unique aspect of these AIs is their mastery within a single, specific domain. They aren’t general, but superintelligent—surpassing human capability—within their respective areas.

In this article within the “Navigating the Future with Generative AI” series, we’ll explore two inevitable stages in humanity’s future: AGI and Superintelligence.

2. Defining AGI: What Do We Really Mean?

Numerous definitions exist for what we call AGI and superintelligence. These terms often intertwine in contemporary discussions around artificial intelligence. However, these are two very distinct concepts.

Firstly, AGI stands for Artificial General Intelligence. This signifies a state of artificial intelligence built upon several building blocks: machine learning, deep learning, reinforcement learning, the latest advancements in Generative AI and Imitation Learning algorithms, and basic code. These all contribute to a level of versatility in task execution and reasoning. This developmental stage of synthetic intelligence mirrors what an average human can achieve autonomously in various areas, demonstrating a generalized capability to perform diverse tasks.

These tasks stem from a foundation of knowledge—akin to schooling—combined with basic learning for completing new, periodically defined objectives to achieve specific goals. These goals exist within a work setting: finalizing an audit ensuring corporate compliance with AI regulations, ultimately advising teams on mitigation strategies. Conversely, they exist in daily life: grocery shopping, meal preparation for the next day, or organizing upcoming tasks. This AGI, working on behalf of a real human, benefits from globally accessible expertise. These attributes enable assistance, augmentation, and ultimately, complementation of everyday actions and professional endeavors. In essence, it acts as a controllable assistant: available on demand and capable of executing both ad-hoc and everyday tasks. The operative word here is general, implying a certain universality in skillsets and the capacity to execute the spectrum of daily tasks.

I share Yann LeCun’s view: a key missing element in current AI models is an understanding of the physical world. Let’s be more precise:

  • An AI requires a representation of physics’ laws but also an operational model determining when these laws apply. A child, after initial stumbles, inherently understands future falls will occur similarly, even without knowledge of the gravitational force field. They can learn, sense, and anticipate the effects of Earth’s gravity. Similarly, our bodies grasp the concept of weight calculation without comprehending its mathematical expression before formal learning.
  • Beyond this world model, an AI needs to superimpose a system of constraints, continuously reaffirming the very notion of reality. For example, we understand that wearing shoes negates the feeling of the hard ground beneath. Our preferred sneakers, due to their soles, elevate us a couple of centimeters, offering a slight cushioning effect while running. We trust the shoes won’t detach, having secured the laces. We vividly recall fastening those blue shoes before beginning our run as usual. Most importantly, we possess the unshakeable belief we won’t sink into the asphalt, knowing it doesn’t share mud’s consistency. Thus, we can confidently traverse our favorite path, striving for personal satisfaction, aiming to break that regional record.
  • An AI needs not only the ability to plan but also the capacity to simulate, adapt, and optimize plans and their execution. Recall your last meticulously planned trip. Coordinates meticulously plotted on your GPS, you set off with time to spare. But alas, the urban data was outdated, missing the detour at the A13 freeway entrance. Then, misfortune struck: an accident reported on the south freeway, traffic condensing from three lanes into one. Stuck in a bottleneck, only two options remain—pushing forward in hope or finding an alternate route. Checking your watch: 23 minutes left to reach your destination. This is how dynamic and complex planning a task can be. And yet, humans are capable of handling this all the time.
  • An AI requires grounding in reliable and idempotent functionalities, echoing the foundation of classical computing: programming, logic, and arithmetic calculation. The ability to call upon an internal library, utilize external APIs, and perform computations is paramount. This forms the basis of real-world grounding, maintaining “truth” as the very infrastructure of AGI. It’s about providing an action space yielding predictable, stable results over time, much like the verified mathematical theorems and laws of physics backed by countless empirical papers. Take, for instance, the capacity to predict a forest drone fleet’s movements using telemetric data, factoring in wind speed and direction, geospatial positioning, the relative locations of each drone and its neighbors, interpreting visual fields, and detecting obstacles (trees, foliage, birds, and so on).
  • An AI has to capitalize on real-time sensory input to infer, deduce, and trigger a decision-action-observation-correction loop akin to humans. For instance, smelling smoke immediately raises an alarm, compelling us to locate the fire source and prevent potential danger. Smartphones, equipped with cameras and microphones, display similar capabilities. Taking this further, devices like Raspberry Pis, when combined with diverse electronic sensory components, can even surpass human sensory capacities. Consider a robot with ultraviolet, infrared, or ultrasonic sensors, allowing it to “sense” things beyond our perception. This lends literal meaning to Ilya Sutskever’s statement.
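To make the decision-action-observation-correction loop from the last bullet more concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than part of any real agent framework: the `World` class, the constant `drift` standing in for a disturbance such as wind on a drone, and the simple proportional `gain`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class World:
    """Toy 1-D environment: a position that drifts a little on every step."""
    position: float = 0.0
    drift: float = 0.3  # hypothetical disturbance, e.g. wind pushing a drone

    def observe(self) -> float:
        return self.position

    def apply(self, action: float) -> None:
        # The intended move is perturbed by the drift.
        self.position += action + self.drift


def run_loop(world: World, target: float, gain: float = 0.5, steps: int = 50) -> List[float]:
    """Observe, decide, act, then correct on the next pass using the new observation."""
    errors = []
    for _ in range(steps):
        observed = world.observe()   # observation
        error = target - observed    # compare the observation with the goal
        action = gain * error        # decision: proportional correction
        world.apply(action)          # action, perturbed by the environment
        errors.append(abs(target - world.observe()))
    return errors


errors = run_loop(World(), target=10.0)
print(f"first error: {errors[0]:.2f}, last error: {errors[-1]:.2f}")
```

With these made-up numbers the error shrinks from about 4.7 to about 0.6 and then stalls: a purely proportional correction never fully cancels a constant disturbance. That residual gap is a small reminder of why the bullets above also ask for a world model and for replanning, not just a reactive loop.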

This implies that AGI won’t necessarily be beneficial or provide significant added value in highly specialized fields, especially in areas where humans have been traditionally adept. This applies to domains like fundamental research, inventiveness, and engineering design – areas I believe will remain constrained by the currently available knowledge pool on the internet. This limitation arises because AGI’s continued advancement is largely driven by companies tailoring it to their specific expertise, often regarded as intellectual property.

Thus, we progressively journey towards AEI: Artificial Expert Intelligence. This translates to a model or agent that is a pinnacle expert in its field. Imagine an AEI on par with the top 2% or so of experts on this planet (> 2σ, assuming a roughly normal distribution of skill), reaching Olympian levels, like AlphaProof and AlphaGeometry 2, which together reached silver-medal standard on the 2024 International Mathematical Olympiad problems.
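As a quick check on the σ threshold quoted above, the sketch below relates a σ cut-off to a "top x%" share. It assumes expertise follows a normal distribution, which is a simplification made only for illustration, and it uses Python's standard `statistics.NormalDist`.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Share of a normally distributed population lying more than 2 sigma above the mean.
share_above_2_sigma = 1.0 - standard_normal.cdf(2.0)

# Sigma threshold that isolates the top 5% of the same population.
sigma_for_top_5_percent = standard_normal.inv_cdf(0.95)

print(f"share above 2 sigma: {share_above_2_sigma:.4f}")
print(f"sigma cut for top 5%: {sigma_for_top_5_percent:.3f}")
```

Under that assumption, being more than 2σ above the mean corresponds to roughly the top 2.3% of the population, while a top-5% group starts at about 1.64σ.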

The architectures with the most potential rely on active collaboration between expert models (Mixture of Experts) and between agents (Mixture of Agents). Even when individual model performance within this collaborative framework isn’t the absolute best, the collaborative outcome exhibits a quality level on par with, if not exceeding, that of the best individual models like GPT-4o. It’s a striking testament that collaboration, be it human or artificial, remains the most effective avenue to reach any objective.
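A back-of-the-envelope way to see why a group of imperfect models can beat each of its members is simple majority voting. The sketch below is not an implementation of Mixture of Experts or Mixture of Agents; it is a much cruder stand-in that assumes every model answers a yes/no question independently with the same accuracy (the 0.70 figure is hypothetical).

```python
from math import comb


def majority_vote_accuracy(n_voters: int, p_correct: float) -> float:
    """Probability that a strict majority of n independent voters is right."""
    majority = n_voters // 2 + 1
    return sum(
        comb(n_voters, k) * p_correct**k * (1.0 - p_correct) ** (n_voters - k)
        for k in range(majority, n_voters + 1)
    )


single = 0.70  # hypothetical accuracy of each individual model
for n in (1, 5, 15):
    print(n, round(majority_vote_accuracy(n, single), 3))
```

Under these assumptions, five voters at 70% accuracy reach roughly 84% as a group, and the figure keeps rising with more voters. Real models make correlated errors, so actual gains are smaller, which is one reason collaborative architectures favor diverse experts and agents.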

3. Humanity’s Inevitable Ascent Towards Superintelligence

Revisiting the human versus machine narrative, 2018 marked a pivotal encounter: AlphaStar versus TLO (Dario Wünsch), then MaNa (Grzegorz Komincz), two professional StarCraft II players from the renowned Team Liquid. Created by Google DeepMind, AlphaStar is a digital prodigy trained on the collective experience of 600 agents, equivalent to 200 years of playing StarCraft II.

Consider the inherent imbalance when directly contrasting human capabilities against those of AI:

  1. Replication Capacity: AIs can be copied indefinitely.
  2. Relentless Training: AIs train ceaselessly, needing no sleep, nourishment, or breaks.
  3. Absolute Focus: AIs exhibit unwavering concentration on their designated tasks.
  4. Self-improvement through concurrent learning: AIs hone their abilities by training against their evolving intelligence, devising novel strategies to secure victory.
  5. Scalability: adding computing and memory resources generally yields greater performance, although the gains are rarely strictly linear.

The outcome: an AI consistently outmaneuvering the crème de la crème of a strategic open-world video game’s premier league. And as if that weren’t enough, it maintains its position within the Grandmaster league.

Here lies the very essence of an intelligence surpassing human decision-making abilities within a similarly vast and dynamic environment: this is what we classify as Superintelligence, or ASI.

Superintelligence, from my perspective, transcends mere human intelligence and even surpasses collective human intelligence. It indicates that even a group of individuals, regardless of their combined expertise and knowledge, would be outpaced, left trailing by an artificial intelligence capable of going beyond their cumulative potential.

Imagine instead a new form of synergy: a “super” human system collaboratively engaged in highly cognitive functions with this Superintelligence. This involves humans directing or, perhaps more accurately, guiding this Superintelligence based on our needs. While this Superintelligence operates with its own raison d’être, it wouldn’t clash with the fundamental purpose of humanity. This Superintelligence possesses access to those superior functions—understanding the universal model within which humanity exists. It possesses the model of reality itself.

Moreover, it resides within a self-improvement and discovery paradigm, continuously unveiling novel operations, new paradigms, and potentially even new forms of energy. Think entirely new physics laws that govern our universe; laws that humans, as of yet, have not uncovered. This encompasses diverse domains: medicine, engineering, revolutionary material science, new composite development, and engineering breakthroughs for unprecedented construction methods. Envision a symbiotic relationship between humans and machines fulfilling humanity’s ambitions. The limitations posed by individual human existence or the current state of collective human intelligence dissolve; no longer a barrier, it morphs into an expansive vision of human evolution, a potential accelerator for progress.

It even prompts new questions: How far can humans evolve? Or more precisely, how quickly?

However, we shouldn’t discount the possibility that artificial Superintelligence will be seen, or will see itself, as a novel species.

Therefore, being as rational as possible, we cannot accurately predict if this species would afford humanity the same compassion and civil collaboration that we strive for with our fellow human beings. It’s even plausible that they won’t hold any particular regard, instead pursuing their objectives, much like we think little of stepping on ants while daydreaming in a beautiful landscape, lost in contemplation, our thoughts oscillating between everyday worries and future aspirations.

4. What Would Constitute Human Superintelligence?

Human superintelligence embodies the culmination of all accumulated knowledge, discoveries, experiences, and yes, even the mistakes made by our ancestors to this point. Ultimately, this human superintelligence represents the collective “us” of today. It’s what fuels our intricate logistics and supply chains, our relentless pursuit of natural resources. It underpins our scientific endeavors: from breakthroughs in biology, mathematics, and agriculture, to understanding our global economic system – allowing us to manage our resources effectively, allocate them efficiently, and strategize our reinvestments. Money, in this light, transforms into a socio-economic technology.

Essentially, when comparing human superintelligence—today’s collective human intellect—with artificial superintelligence, a stark contrast emerges in their evolutionary cycles. Artificial intelligence advances at a significantly faster pace, powered by recent breakthroughs in training using our data. This data, importantly, reflects our findings, the mirror to thousands of years of human advancement accessible through the internet. This hints that artificial superintelligence would evolve at a much faster rate than humanity itself.

This rapid advancement stokes anxieties about potential disruption within the job market. Tech titans like Sam Altman advocate for Universal Basic Income (UBI) as a safety net for those displaced by artificial intelligence or robotics, allowing individuals to meet their basic needs even after losing their jobs. At that juncture, work itself detaches from its traditional role: that direct link between labor, contribution to the value chain, recognized worth, and societal standing. Instead, we confront the image of an economic umbilical cord: individuals sustained by the state, which is funded by fellow citizens.

While I remain undecided on my stance regarding UBI’s necessity, it compels contemplation. When UBI becomes a reality for a significant portion of the population, what function does money truly serve within our society? How do we sustain work motivation beyond “earning a living” when basic needs are met without active contribution? What ripples will be felt throughout a sovereign currency? Will the collective of people continue to control the economy, or is the future in the hands of AI-driven megacorporations?

There are so many answers yet to be uncovered.

After all, maybe “computing” should be considered a universal right. Therefore, we would shift the focus from UBI to UBC, Universal Basic Computing.

5. AGI and Superintelligence: Steering Toward a Future of Abundance or Ruin?

The next cycle hinges on resource accessibility and access to “programming” the world. Initially, artificial intelligence, at the very least, will permeate our daily lives. We are transitioning to personalized AI assistants, specializing in our chosen pursuits, whether robotics for errands, learning assistance for mastering a new language, or perfecting one’s singing voice. Second-to-none, specialized AI coaches will emerge to help us achieve elite athletic status, along with AI tutors guiding our artistic development beyond the readily available generated art of today.

Simultaneously, this superintelligence would be managing our complex systems: national infrastructures, electricity grids, vast transportation and logistical networks. Thus, it can drive early warning systems for natural disasters or power next-generation weather prediction platforms that incorporate oceanic currents. It will even account for stellar events such as shifts in the sun’s activity, factoring in our solar system’s dynamic positioning.

In conclusion, these are just glimpses into the potential futures shaped by AGI and superintelligence. However, the core message remains: we stand at a critical juncture. Depending on our collective appetite for progress, we could be headed toward a future of abundance or stumble along the path toward our own undoing.

Science offers an incredible opportunity: the chance to break free from a civilization driven by profit-motivated conflicts and ideological clashes. Instead, it enables collaboration guided by a neutral, third-party entity—one that embodies the best of what we, as a species, have strived for, built, and imagined. This collaboration offers a path for our societal framework to truly evolve.

The future is bright if we make it right.

🫡

Categories
Artificial Intelligence Business Businesses ChatGPT Engineering EU AI Act GPT4 GPT4o Information Technology Innovation Llama Meta OpenAI Regulation Technology

✨ Llama 3.1, Meta and the EU AI Act – Where are the areas of synergy between innovation and regulation?

Llama 3.1 AI model

Llama 3.1, a 405-billion-parameter model, has just been released by Meta.

It comes with improved performance. Some early tests put it on par with GPT-4o.

A few perks:

  • Still #opensource
  • 128K token context window
  • Improved Multilingual Support. Meta is a leader in multilanguage models.
  • Comes with a new security and safety tool for advanced moderation and control mechanisms to ensure safe interactions.
  • Improved capabilities for creating synthetic data.

I find the partner ecosystem supporting Llama, including NVIDIA, Google Cloud, Microsoft, and Groq, already quite impressive (see picture).

But also…

While the EU AI Act was officially published in the Official Journal of the European Union on July 12, 2024, to enter into force on August 1, 2024, Meta delivered worrisome news for the #artificialintelligence open source community.

In a nutshell, Meta will withhold the rollout of multimodal AI models in the EU region until the regulatory rules are clarified.

The EU AI Act contains explicit rules for foundation models, which it calls “general-purpose AI models”, including the following:

  • Article 51: Classification of general-purpose AI models as general-purpose #AI models with systemic risk
  • Article 53: Obligations for providers of general-purpose AI models
  • Article 55: Obligations for providers of general-purpose AI models with systemic risk
  • Article 56: Codes of practice

Let’s hope we will find a way to balance #innovation and #regulation.

🫡

Categories
Technology Artificial Intelligence Business Innovation personal development Questioning Self development Wisdom

The Art of Questioning: How One Well-Crafted Question Can Significantly Improve Your Life (and Everyone Around You)

“There are no dumb questions, only dumb answers”

This is what my teacher told me when I was in primary school. This simple yet powerful sentence cancelled my fear of looking stupid when asking questions whenever I felt the need to.

Today, in my leadership position, I regularly ask questions for various reasons. The primary motive is that leading within the realm of excellence requires understanding the situation and bridging it with the objectives you must reach. Then, it is about delivering a simple message, built on the why, how and what, so that your peers understand the mission and have all the elements to make it happen. Everything happens for a reason; my responsibility lies in exposing the reason, making decisions, and consequently triggering actions.

The second motive is that I need to understand the depth of systems and I simply LOVE acquiring knowledge. This fuels my motivation. It is like filling an endless toolbox that helps me to invent, build, compose, and innovate. I guess it is the scientist and the engineer in me. And I am still as curious about life as a 4-year-old kid. All innovation starts with the same question: ‘How do we solve this problem?’ This question launches startups and is the mother of all technological advancements.

The sentence I use the most is “I don’t understand, can you explain, please?”. It is about being aware that there is a variety of things to discover, that there are many different points of view, and even more ways of thinking. What fascinates me the most is when people share their experiences. I am also conscious that my interlocutor’s sentences and my own understanding are two sides of the conversation. The point is that communication is not always smooth.

It is about finding the right channel and angle.

It is about finding the right tempo and timing.

In this situation, I use the almighty Question Mark, a weapon created eons ago as powerful as Thoth’s Caduceus.

“I am just a child who has never grown up. I keep asking these ‘how’ and ‘why’ questions. Occasionally, I find an answer.”

Stephen Hawking

Yes, the question mark, a single-character sign, possesses this power. Just like a math unary operator, it is a linguistic operator that empowers you with multiple abilities. And here they are:

Expanding knowledge by asking a question. The brain has the habit of filling the gaps of ignorance using analytical skills, associations of concepts, and similarities. While it works more often than one thinks, it is inaccurate, even misleading sometimes. By gaining more awareness of this innate mechanism, we tend to cover gaps by following the foundational maxim, “If you don’t know, ask”. The Bible puts it another way: “Ask, and it shall be given you.”

Assert or confirm a statement by seeking a true or false answer, increasing the force of your internal knowledge system. English has this nice tag-question formulation where the auxiliary and the pronoun are inverted. For example: “The dog was cleaned, wasn’t it?” or “The Internet is the most powerful network, isn’t it?”. Another technique consists of using “be” as a tool for opening the field of possibilities for answers, meaning one can respond by confirming or refuting with a piece of complementary information. For example: “Was it Descartes who once said ‘I think, therefore I am’?”. Here, I tend to see the question mark as an invitation to conclude a contract of trust: my understanding is acknowledged; therefore, I trust your words.

It forces action through suggestions. When you say “You are going to do it, aren’t you?”, you give that push that will trigger the intended cascade of events.

You can use a question mark to invite someone to do something or join you. For instance, “If you have time, would you like to have dinner tonight?”. This is a gentle way to ask someone to do something of their own will.

It demonstrates your genuine interest in what someone is saying or thinking. For example, when someone asks a question during a debate, it constitutes a trigger indicating a need to be filled or a bond to be made. It is also an expression of interest in what she or he could say. In this regard, it is an invitation to participate in the idea exchange.

Did you know your questions are so valuable they have built business behemoths?

In 1998, Larry Page and Sergey Brin founded Google. Interestingly, the original name was “Googol,” but, due to a mistake, the company was registered as “Google.” A googol is the number 1 followed by a hundred zeros.

Fast forward to today, and Google is the undisputed king of search engines, holding a market share of roughly 91%. By some estimates, Google handles about 85 billion queries per month in 2024.

Notably, the second most popular website is YouTube, which also relies on finding the right video based on your searches and attention. Google, YouTube, Instagram, Pinterest, X, and TikTok all share a common revenue source: advertising. Google popularized keyword bidding for ads: the more directly a keyword is associated with a common search, the more expensive it becomes. That’s how your questions acquire monetary value and become the fuel of social media super-algorithms.

Now that we are talking about money, it reminds me of the book “Rich Dad, Poor Dad” by Robert Kiyosaki. There is a striking example of how asking the question “How can I buy this house?” instead of asserting “I cannot buy this house” shifts the way your brain operates. Asking yourself the right question, rather than making an affirmation, triggers a thinking process that induces self-motivation. Facts are static until they get shaken; questions are the beginning of a story. They represent the spark.

While teaching my children how to use the Internet safely, I have rediscovered the most powerful effect of the question mark: the power to plant the seed of an idea with a drop of water, and the new beginning of the dark ages. It is the rise of clickbait and misleading titles in online newspapers that makes me wonder, “Why do people click?”, “Why do I want to click?”, “What is the true intention of this article?”. By turning an idea into a question, there is no assertion. This is a state where neither the truth nor the lie is told but induced. You express your opinion in the most open way possible, and you activate the inner functions of the human psyche. A mind will naturally try to fill gaps whenever information is missing to either decide, lead one toward a goal, or fill a knowledge hole. This process is just an innate mechanism of the brain. You should know that a lot of content is now automatically generated by artificial intelligence, designed to exploit these psychological mechanisms for traffic and revenue or, more harmfully, to trap users with malware. The intent is to hack your mental system. One must be aware of this and should always ask the question, “What is the true intent of the authors?”. Then, what’s the difference between convincing and manipulating?

If you have yet to see the movie Inception by Christopher Nolan, I highly suggest watching it.

By asking a question, you can present your idea gently and say, “I may be wrong, and I am open to suggestions. Let me know what you think”. On the positive side, when the intent is pure, it is a form of expression that is polite and smooth. It reminds me of the precepts of “Nonviolent Communication” by Marshall Rosenberg. On the negative side, you can exploit people’s minds and practice the art of manipulation.

Questions create cues, which are powerful mechanisms to awaken your mind and provoke a reminder to act or recall a memory. I have discovered that expert project managers use this technique, either willingly or subliminally, to prompt actions and increase the likelihood that the flow of work and cadence stays high among the various actors involved in the project. Sometimes, great achievements are born from seemingly insignificant pushes.

A well-architected set of questions creates a frame of thought. This frame acts as a guide, helping you to build reasoning. This reasoning leads to other questions and decisions. Ultimately, it creates an outcome, like a design, a plan, or an analysis, that you can use, learn from, or teach. This set is even more powerful than a task list, even though it requires more effort. I used this technique for building the AMASE Product and Service Architecture Canvas.

Beyond questions lie the dynamics of knowledge flows.

For those familiar with hermetic principles, you’ll recognize the profound power of questions. I view questions as channels of feminine energy – not in terms of gender or sex, but as conduits drawing information, knowledge, and wisdom from one place to another. This energy is then transformed into action, thoughts, decisions, or stored as knowledge.

Consequently, speaking the truth and sharing what you know (outside of storytelling contexts) is a principle one should adhere to. To do otherwise risks poisoning minds, as discussed earlier.

Interestingly, as a question receiver, you’re invited to answer or act: essentially, to give energy. Have you ever been in a job interview? That long series of interrogations is exhausting. With this in mind, you’ll understand why the pace and sequence of questioning, followed by active listening, enhance the quality of an interview or conversation.

To push this reflection further, I see wisdom as a higher state of energy that one can maintain effortlessly. Wisdom is a dense ball of useful energy – the accumulated and structured knowledge and know-how – from which one can tap and maximize benefits. The wise can passively nurture and grow this body of wisdom like a supernova. This is the path of the Sage.

TED’s motto is “Ideas worth spreading.” We’ve discussed how to uncover such ideas, but an equally important question is: how do we spread specific pieces of knowledge that are worth sharing?

Once again, questions are tools. I know an executive who employs this tool skillfully to align their staff and spotlight individuals who deserve recognition, especially in front of other executives. Questions spread easily like ideas because they open mental gates and reposition worldviews by offering new perspectives. The expression “Hmm, I never thought about that” often originates from a well-posed question!

The 2nd dimension of questioning

“I don’t want to be a didactic voice. I like to ask more questions than I answer, just to get people thinking and to make it safe to access art”

Hannah Gadsby

People use a secondary dimension as they speak to fuel the question mark’s superpower: the tone. The tone gives more depth, and more information, to communication.

A tone that ends with a high pitch will induce emptiness, which is to be filled with knowledge.

A tone ending with a low pitch will induce completeness, almost filled with willpower.

It can be warm and calm, so that your audience has fertile ground for inner reflection and a deep-thinking process is triggered.

Another way is to use a more seductive tone to induce something without saying it. Like the previous superpower, it is to be used with genuine intention because it makes people believe something. Therefore, it is better to be true; otherwise, you are just abusing someone’s dreams or weaknesses.

Finally, one can use a more threatening tone. The purpose of this is to express “one last chance”. Your counterpart is put in the corner, and she or he will have to make a choice, for which the result may be costly if it is the wrong choice. As the former heavyweight boxing champion, Mike Tyson, said: “Everybody has a plan until they get punched in the mouth”.

I realized that, like laughter, tone signifies the same meaning in most languages. Perhaps the truth lies here: sound imbued with emotion is the universal language, don’t you think?

To wrap up, the Art of Questioning can be synthesized into five key aspects:

  1. The art of quickly and clearly mapping the realms of the unknown and unclear, while making their boundaries tangible for everyone.
  2. The art of unearthing the truth.
  3. The art of carving an idea to absolute clarity.
  4. The art of guiding, influencing, and (unfortunately) manipulating.
  5. The art of mastering the flow of knowledge.

Can we consider questioning a science? I asked myself this and discovered there is, indeed, a field of research dedicated to ‘questioning,’ as well as a nascent discipline named ‘Questionology.’

With that, I’ll leave you with one final piece of advice: With practice and curiosity, awareness leads to wisdom. Use the question mark’s power wisely.

🫡

Yannick HUCHARD

Categories
Artificial Intelligence Automation Business Business Strategy Engineering Innovation Robots Strategy Technology Technology Strategy

Update on Tesla’s Optimus #Robot – it is progressing fast

Tesla’s Optimus Robot learning from humans

The most impressive part is the technique employed by the Tesla team for accelerating the robot’s dexterity: the robot physically learns from human actions. 

Now, let’s step back and analyse Tesla’s master plan here:

(Putting on my business tech strategy goggles) 

1. Tesla builds electric cars augmented with software programmability.

2. Tesla provides an electric grid as a service.

3. Tesla builds gigafactories that maximize the automation of car manufacturing. Almost every single part of the pipeline is robotized and optimized for speed of production.

4. Tesla builds Powerwalls (by providing energy storage, it also creates a decentralized power station network).

5. Tesla brings autonomous driving (FSD) to Tesla cars. Essentially, cars are now transportation robots governed by the most advanced AI fleet management system.

6. Tesla builds its own chips (FSD Chip and Dojo Chip)

7. Tesla builds its own supercomputers.

8. Tesla launches Optimus, which aims to replace the human workforce in factories and warehouses.

9. X.ai, X’s supposed “child” AI company, which recently raised $6 billion, brings the Grok AI model trained on X/Twitter data. While you may say X data is not the best, X has an algorithm balanced with human judgment (Community Notes), AND the platform brings together the largest set of news publishing companies. Basically, it automates curation and accuracy.

10. A version of the Grok AI model will likely power Optimus’s human-to-robot conversational interface.

11. Tesla cars will be turned into robotaxis, disrupting not only taxi companies but also Uber (the Uber/Tesla partnership may not be a coincidence), and eating into the market share of Lyft and BlaBlaCar.

12. Tesla will enter the general services and retail industries with multi-purpose robots: cleaning services for business offices and grocery stores, filling the workforce shortage in the catering (hotel-restaurant-bar…) industry, etc.

Tesla is not the only one moving into the “Robot Fleet Management” business. Chinese companies like BYD (EV) offer strong competition, and several robotics companies (like Boston Dynamics and Agility Robotics) are racing for pole position.

#AI #artificialintelligence #Robotics #Optimus #EV #software #EnergyStorage #Automation #powerwall #AutonomousVehicles #FSD #chips #HighPerformanceComputing #Robots #GrokAI #NLP #robotaxis #innovation #WorkforceAutomation

Categories
AR/VR Augmented Reality Innovation Mixed Reality Technology UX Virtual Reality

Apple Vision Pro – I Thought I Knew What The Metaverse Would Feel Like. I Couldn’t Be Further From The Truth.

A couple of weeks ago, I received an unusual meeting invite. It said “Test Apple Vision Pro.” I read it twice and jumped at the opportunity. I had been longing to get my hands on an AR/VR device that could make my dream idea – an augmented world (project Vmess platform) – a reality. That day was finally coming.

What better way to cap off an amazing work week at Banque Internationale à Luxembourg (BIL) than by getting up close and personal with Apple’s groundbreaking #mixedreality marvel – the #VisionPro? Last Friday, I had the immense privilege of taking this pioneering device for a spin.

Let me be blunt: before trying the Vision Pro, I thought I had a decent idea of what the metaverse experience would be like. But I couldn’t have been more wrong. This isn’t just the future – it’s a portal to parallel universes that shattered my expectations.

The Vision Pro isn’t a smartphone replacement; it represents an entirely new frontier, a mind-bending window into the so-called #metaverse. Furthermore, everything is at hand: you pinch to interact. Like every Apple creation, it exudes sophistication down to the finest detail.

The display resolution? Words fail to capture its otherworldly crispness and depth. And we’re not merely talking apps here; these are full-fledged, multi-sensory experiences that transport you to realms you thought only existed in science fiction.

Mark Zuckerberg was certainly onto something with his metaverse vision, but Apple seems poised to leapfrog everyone with this staggering delivery that must be witnessed firsthand. 

My rendezvous with the Vision Pro was more than a tech spectacle, though. It was also a heartwarming reunion with the brilliant minds at Virtual Rangers. Their #VR app portfolio is impressive, but what moved me most was “Roudy’s World” – an experience lovingly crafted to inspire hope and joy in children facing unimaginable adversity.

Immense gratitude to Matthieu Bracchetti and the entire Virtual Rangers crew, along with François Giotto, for making this future-altering experience possible. The metaverse future we yearned for? It’s already here, and it’s far grander than we ever conceived.

#augmentedreality #virtualreality #artificialintelligence #ai #digital #innovation #tech2check #digitalaugmentation

Categories
web architecture Artificial Intelligence Automation Autonomous Agents Information Technology Services Technology User Experience UX

Navigating the Future with Generative AI: Part 3, Building the AInternet – AI, Web, and Customer Experience

A Revealing Experience

Allow me to share a personal experience that perfectly illustrates the challenges I will discuss. I was involved in a car accident where a vehicle coming from the opposite direction severely damaged the right side of my car. Following the procedure, I filed an accident report with the other party, although I found myself unable to provide my insurance number simply because I didn’t have it readily available at that moment.

In the meantime, I went to my regular dealership so that an appraisal could be carried out and the next steps for repair could be determined. I then contacted my leasing company, and one of their agents agreed with me and the dealer that I would drop off the vehicle within two weeks. A replacement vehicle would be provided, and the full repair would take one to two weeks.

However, due to my lack of foresight, I did not deem it necessary to contact them again initially. A few days later, I received a letter from them informing me of the accident – which was correct – but also stating that I had not submitted the accident report and that without it, their insurance reserved the right not to cover the damages. In fact, I had sent this document a week earlier, but to the wrong email address. Out of habit, I had used their general contact details, avoiding contacting the agent in charge of my leasing file – who had recently retired. As a precaution, I had even added the generic address, but clearly without success since the insurance department had not received it.

I then called them back urgently to obtain clarification. They confirmed that the accident report was missing, and the agent, with great understanding which I acknowledge, told me that I had to send it to another specific address because the insurance department had not been notified by their colleagues in charge of customer relations. Moreover, the latter was not authorized to provide me with a replacement vehicle until the repair shop had received their approval – even though it was the approved dealership where I had been carrying out all maintenance operations for years.

This kind employee then offered, as an exception, to handle my entire case without further difficulty since the drop-off of my vehicle was imminent, just a few days away. She also knew that my leasing contract was expiring and that I would have to return the vehicle in two weeks to obtain a new one.

While this situation caused me a little stress, it was only temporary. An hour later, the agent contacted me again to confirm that everything was settled: I could bring my vehicle the following Monday and a replacement vehicle would be provided for the duration of the repairs.

Lessons from This Experience

You may be wondering why I am sharing this story with you.

First of all, I was unaware of the procedures governing the reporting of an incident in the context of a leasing contract. Should I first contact my company, the historical leasing company directly, or the new one? When I called them, why didn’t I reach the dedicated claims and insurance department directly? Why didn’t I find any information about this on their website? Why, when everything seemed clear to me – that I would drop off my vehicle within two weeks, that a replacement vehicle would be waiting for me, and that the repairs would be handled smoothly – did things unfold differently because the proper procedure had not been followed?

Beyond that, how can a single service company exhibit such a lack of communication between two complementary departments?

The Revolution of the “AInternet”


We are entering a new era where artificial intelligence will be at the heart of exchanges between human beings. Previously, everyone had to search for information themselves on the Internet, navigating from site to site, browsing the Yellow Pages, and compiling data to find a company’s contact details, the instructions for a recipe, or the contact details of a repairman. The new paradigm will rely on exchanges between humans, intermediated or not by an artificial intelligence capable of performing tasks synchronously or asynchronously (i.e. in the background) to provide immediate knowledge to users rather than forcing them to seek it out.

And to return to my use case, the AInternet brings a revolutionized customer experience that unfolds as follows:

When I am involved in an incident, I ask my personal AI assistant to help me fill out the accident report digitally. I do not have to provide all the information since my assistant has a global context encompassing data related to my vehicle, its insurance, my contract, my identity card, my passport, my postal and telephone contact details, my insurer, the maintenance status of my car, its technical inspection certification, etc. All this information allows for automatic and complete filling of this type of interaction.

Next, I only need to ask my assistant to contact the assistants of my leasing company and my insurance company, to ensure that the report I have validated and electronically signed is transmitted and processed by these two parties.

The assistant of the leasing company then informs the agent that a replacement vehicle is required and that an approved garage must be contacted to book an appointment for the repairs. It also determines whether my car should be taken directly to the dealership in charge of its regular maintenance. The relevant agent then handles my vehicle accordingly.

The agent only has to ask their assistant for the contact details of my garage to reach out directly.

From there, a genuinely empathic human relationship is established as we build a frictionless mutual understanding of the situation. Following the garage’s preliminary appraisal report, the leasing agent and the garage are prepared to agree on an appointment date, which is then recorded in the various systems.

The garage proceeds in an automated manner with the reservations and orders for the spare parts necessary for the repairs.

Simultaneously, the leasing company manages with the insurance company all the steps required to allow for the vehicle repair and the provision of a replacement vehicle during the downtime.

Finally, the agent contacts me personally, by phone or message on a platform such as WhatsApp, to confirm everything is in order:

  1. The incident has been properly recorded and the insurance company will cover all costs.
  2. An appointment has been set with my garage.
  3. A replacement vehicle will be provided during this period.
  4. An estimated date for returning the repaired vehicle has been communicated.

They wish me an excellent day with a smile, since their assistant and mine have handled the entire procedure seamlessly. This augmented interaction allows us to reach new heights of fluidity and ubiquity in exchanges.
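The scenario above can be pictured as a chain of messages passed between assistants. The following is only a minimal sketch under stated assumptions: the `Assistant` and `Message` classes, the message subjects, and the sample plate number are hypothetical names chosen for illustration, not an existing protocol or product.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str
    recipient: str
    subject: str
    payload: dict


@dataclass
class Assistant:
    """Hypothetical personal or company assistant with a simple inbox."""
    owner: str
    inbox: list = field(default_factory=list)

    def send(self, other: "Assistant", subject: str, payload: dict) -> Message:
        # Deliver the message to the other assistant and return it for logging.
        message = Message(self.owner, other.owner, subject, payload)
        other.inbox.append(message)
        return message


def report_accident(report: dict) -> list:
    """Route one signed accident report to every party that needs it."""
    driver = Assistant("driver")
    leasing = Assistant("leasing company")
    insurer = Assistant("insurance company")
    garage = Assistant("garage")

    return [
        # The driver's assistant notifies both companies.
        driver.send(leasing, "accident report", report),
        driver.send(insurer, "accident report", report),
        # The leasing assistant books the repair and asks for coverage.
        leasing.send(garage, "book repair", {"plate": report["plate"]}),
        leasing.send(insurer, "confirm coverage", {"plate": report["plate"]}),
        # The driver receives a single confirmation at the end.
        leasing.send(driver, "all set", {"replacement_vehicle": True}),
    ]


if __name__ == "__main__":
    # "AB-1234" is a made-up plate number used only for this demo.
    for m in report_accident({"plate": "AB-1234", "signed": True}):
        print(f"{m.sender} -> {m.recipient}: {m.subject}")
```

The point of the sketch is that the customer states the facts once, and the routing that failed in my real-life case (wrong address, uninformed department) becomes the assistants’ job rather than mine.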

I am optimistic, indeed. Why wouldn’t I be? The transformation is already in motion.

The Internet will no longer be confined to a vast catalog of information to consult, such as books, encyclopedias, or applications, where interactions must be initiated and orchestrated by us, humans. Instead, the orchestration between an individual and an organization, between two individuals, or between an organization and a computer system will be performed like a symphony by intelligent agents: artificial intelligences.

This demonstrates an evolution of the World Wide Web architecture, which will constitute a veritable system of systems composed of human beings, applications, automata, and artificial agents.

The challenge from now on to enable this progression towards the era of digital augmentation will be to build artificial intelligence at the heart of human interactions. It is a matter of UX innovation.

It will no longer be a question of programming these interactions in advance by limiting the possibilities, but rather of training these artificial intelligences to handle a wide range of possible scenarios while framing and securing the use cases that could result from malicious computer hacking.

Ensuring a secure web environment requires a multi-layered approach that goes beyond safeguarding the AI models themselves. Equal vigilance must be applied at the integration points, where we erect robust firewalls and implement stringent access controls. These protective measures aim to prevent artificial intelligence from inadvertently or maliciously gaining entry to sensitive resources or confidential information that could compromise the safety and well-being of individuals, imperil organizations, or even threaten the integrity of the entire system.
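One way to picture an access control at an integration point is an allow-list checked before an AI agent reads anything. This is a minimal sketch, assuming hypothetical role names (`customer_assistant`, `claims_assistant`) and resource names; a real deployment would rely on its own identity and policy systems.

```python
# Hypothetical policy: which resources each agent role may read.
ALLOWED_RESOURCES = {
    "customer_assistant": {"contract_summary", "appointment_calendar"},
    "claims_assistant": {"contract_summary", "accident_reports"},
}


class AccessDenied(Exception):
    """Raised when an agent requests a resource outside its allow-list."""


def fetch_resource(agent_role: str, resource: str, store: dict) -> str:
    """Return a resource only if the agent's role is allowed to read it."""
    allowed = ALLOWED_RESOURCES.get(agent_role, set())  # unknown roles get nothing
    if resource not in allowed:
        raise AccessDenied(f"{agent_role} may not read {resource}")
    return store[resource]


if __name__ == "__main__":
    # Made-up data standing in for a company's back-office systems.
    store = {
        "contract_summary": "Lease ends in two weeks",
        "accident_reports": "Report received, right side damaged",
    }
    print(fetch_resource("claims_assistant", "accident_reports", store))
    try:
        fetch_resource("customer_assistant", "accident_reports", store)
    except AccessDenied as error:
        print("blocked:", error)
```

The check sits outside the model on purpose: even a model that has been talked into asking for a resource it should not see is stopped at the boundary.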

Thus, emerging risks such as jailbreaking, which aims to deceive an artificial intelligence, will have to be offset by other supervision and protection mechanisms, because an AI lacks the physical senses (sight, hearing, spatial awareness) that allow a person, a company, or a system to be authenticated.

It is on this note that this article concludes. We are living in an era of transition rich in exciting developments, and it will be up to you to build the Internet of tomorrow: the Augmented Internet.

🖖

Categories
Artificial Intelligence ChatGPT GPT3 GPT4 IT Architecture IT Engineering

“API Hero 🤖” – The #GPT That Codes the API for You 🙌

APIs are key to scaling your #business within the global ecosystem. Moreover, your API is a fundamental building block for augmenting universally accessible #AI services, like ChatGPT.

Building an #API, however, can be daunting for non-IT individuals and junior engineers, as it involves complex concepts like API schema, selecting libraries, defining endpoints, and implementing authentication, among others.
On the other hand, for an expert backend #engineer, constructing your fiftieth API may feel repetitive.

That’s where “API Hero” comes in, specifically designed to address these challenges.

Consider an API for managing an “#Agile Planning Poker”. Given a list of functions in plain English, such as “Create Planning Poker”, “Add Participants”, “Estimate User Story”, etc., (including AI-suggested ones), the GPT will generate:

  1. The public interface of the API (for engineers, this corresponds to the OpenAPI/Swagger spec).
  2. #Code in the chosen #programming language, with a focus on modularity and a Git-friendly project structure.
  3. Features like API security, configuration management, and log management.
  4. An option to download the complete code package (no more copy-pasting needed 💪).
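To give a feel for what the plain-English functions could turn into, here is a minimal in-memory sketch in Python. It is not the GPT’s actual output: the class, the method names, and the `consensus` helper are assumptions made for illustration, and a real API would wrap each method in an authenticated HTTP endpoint.

```python
from dataclasses import dataclass, field
from statistics import median


@dataclass
class PlanningPoker:
    """In-memory model of one planning poker session (illustrative only)."""
    name: str
    participants: set = field(default_factory=set)
    estimates: dict = field(default_factory=dict)  # story -> {participant: points}

    def add_participant(self, participant: str) -> None:
        """'Add Participants': register someone who may vote."""
        self.participants.add(participant)

    def estimate_user_story(self, story: str, participant: str, points: int) -> None:
        """'Estimate User Story': record one participant's vote."""
        if participant not in self.participants:
            raise ValueError(f"{participant} has not joined {self.name}")
        self.estimates.setdefault(story, {})[participant] = points

    def consensus(self, story: str) -> float:
        """Hypothetical extra function: median of the votes for a story."""
        return median(self.estimates[story].values())


def create_planning_poker(name: str) -> PlanningPoker:
    """'Create Planning Poker': open a new session."""
    return PlanningPoker(name)


if __name__ == "__main__":
    session = create_planning_poker("Sprint planning")
    for person in ("Ana", "Ben", "Chloe"):
        session.add_participant(person)
    session.estimate_user_story("Login page", "Ana", 3)
    session.estimate_user_story("Login page", "Ben", 5)
    session.estimate_user_story("Login page", "Chloe", 8)
    print(session.consensus("Login page"))
```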

And there’s more!

Search for “API Hero 🤖| AMASE.io” on #ChatGPT’s GPT store. Give it a try and send your feedback for further improvement.

By the way:

  1. Currently, GPTs are accessible only to ChatGPT Plus users.
  2. If you want to know more about the decisive role of APIs for your business, check my article/podcast “Why API are Fundamental to your Business”.

Link to the GPT: https://chat.openai.com/g/g-a5yLRJA1J-api-hero-amase-io

🫡

Categories
In 2060 AR/VR Artificial Intelligence Autonomous Agents Bioengineering Drone Fleet Management Hologram Holographic Display Information Technology Storytelling Technology Transportation Drone Writing

Bioengineering in 2060

“Nasir, check this out. Didn’t I tell you PSG would win against New Manchester? Two bitcoins, baby. Who is the soccer king?”

“Stop bragging, man. Gee, I don’t know how you do it. Hey, Betmania, tell me how this regular human with his XXL ears looking like satellite dishes can beat your prediction. Thank God, you were a freebie AI.”

The bet bot replied, “I am not qualified to review Dr Anoli’s performances. Yet, his 97.26% accuracy is…”

Nasir interrupted the hologram: “Ahh, shut it! It was a rhetorical question. My man! You are good, you gooood.”

He lifted his hand, nodding repetitively to perform his most vigorous handshake.

The upper deck lit up like a lighthouse, and then a deafening sound followed the illumination.
Beeeeeepp. “Emergency. Purple alert. All medical engineers on deck. I repeat, Purple Alert. All medical engineers on deck. This is not a drill.”

My heart was pounding. The message was still resonating in my head.

The sudden drop in temperature, produced by the arch-reactor of the medical drone transporter, announced the arrival of an unusual patient. This was the first time I had seen a flying one. Usually, these vehicles were stored in the Corps of Peacekeepers’ R&D facilities.

The temperature drop caused a fog to rise like a curtain. I saw a blurry figure approaching slowly.

My colleague interrupted my stupor: “I haven’t seen a purple alert for 7 years now, it’s serious. Purple means death.”

The flying ambulance emerged at the heliport located at the center of the critical emergency service. Work is pretty easy on platform seventeen, actually. Nowadays, bioengineering is solving almost all critical injuries as they occur. An accident at a construction site? Any multifunction robot worker medic cauterizes your wound with accelerated healing enzymes, bypassing block surgery. As long as you’ve subscribed to the right medical service, the AI health model can be downloaded, and the tier two healing kit purchased for printing.

“What on earth could have happened?”, exclaimed Nasir.

The hologram of the AI diagnostician, Dr. Ernesto, popped up from the ambient hospital Phygital network while the paramedics were transporting the patient out of the medical drone transporter. Then, she said:

“This is unprecedented, we are losing it. We’ve tried traditional medicine and printed the generic assistance bacteria. Nothing works. The patient’s vitals fluctuate over time at an unconventional rate. Scanning the available knowledge from the current corpus does not provide any satisfying answers. The highest probable disease is the Genova IV virus with a probability of 31%.”

I replied “Too much uncertainty… What do you suggest, Nasir?”

“It looks like the emerging meta-virus. The… guy on B-Hacker YouTube Entertainment System… Sorry, I can’t remember the name of that journalist… said it was an open-source public CASPRed experiment.”

I could not believe it at first. But when I saw his wound mutating in front of me, there was no doubt that the potential disease was human-made.

“Arrrghhhhh. It burns… Please get it off me. I am begging you. Just cut it!” shouted the patient.

I was stunned. I cannot remember the last time I saw a real person suffering that much. Dr. Krovariv told me about it. This was surreal. So much pain.

“Hey! Gather yourself, buddy. There’s no time to waste. We might be facing a level 5 threat here,” said Nasir.

He was such a great guy. He was the embodiment of coolness. Not only was he a great bioengineer, but he always kept his cool on the field. I was more of a lab genius. I wish I could be more like him.

“Huh, yeah. Sorry, I froze,” I said with a coating of stupor, disappointment, and anger. I continued without hesitation, “What’s the status?”

While he was unplugging the portable bio-scanner from the trembling body, Nasir replied: “Body temperature 39.9. Heartbeats – 137. Adrenaline level 50.5. Oxygen level 92.12%. Stress level is increasing by 2.4% per minute. If we don’t act quickly, he is going to have a seizure. The stem cell differentiation rate… 272%! This is too high. It looks like he is growing a supernumerary limb, but I don’t know what kind.”

“Pick up the nano-extractors and verify traces of xeno-mRNA.”
The device was a Sanofi Nano-extractors SNX43, a state-of-the-art fleet of nano-medical machines with a portable management system. Current extractors were all managed centrally by the hospital drone intervention unit.

I quickly put my face next to the patient’s ear and asked, “Sir, I am going to need your authorization to perform an intracellular analysis. We don’t know what you have, this might be your last chance…”

He grabbed my head and screamed, “What the hell are you asking for? Do it! Arghhhh!”

I felt foolish. Yet, there was no way I could bypass the authorization because the device had a legal lock encoded in it.

“That will do,” I whispered. The nano-extractors became operational, switching from blue ‘stand by’ to green ‘proceed.’ The extractors looked like a stick, no longer than a large pen. I placed it next to the wound, and it opened, deploying four branches on each side, and anchoring onto his skin. I heard a sequence of small noises, like pistons. The extractor fleet deployed into his body.

Then a ballet of lights came up. The SNX43 beamed a 3D representation of the arm’s inner part. The system rendered the location of the fleet in real-time.

They moved surprisingly slowly. I was disappointed. I was used to contemplating much livelier robots with the previous version.

Nasir said deductively: “There’s something wrong. They appear to be sucked in or blocked. Zoom in, Nicolas.”

I pinched out the holo-display to enter the zoom command and tapped on the max button. The illusion of slowness came from the fact that the machines were moving at such a speed that it was hard to follow with the naked eye. They looked like bees on steroids, fighting a wall of flesh. But the alien cells rebuilt the wall as soon as it was destroyed.

“I get it; the cellular growth must be faster than the robots can clear the path. It is not good. Not good at all.”

“Have you ever seen something like this before?”

“No…” said Nasir. It was the first time I saw fear in his eyes.

Then he jumped. “It could be contagious. I want all containment units on the platform now. I request a total lockdown of platform seventeen. Now!”

Nasir stopped, then looked at me insistently. I immediately thought something was happening to me.

Then, Nasir froze and fell to the ground.

I jumped immediately to grab him, but something inside me pulled back. My bioengineer spinal safety device prevented me from approaching the new pathogens. Whenever it triggered, it reminded me of the auto-braking mechanisms in my grandparents’ cars. I do not like the idea of having a machine tweaking my neural system. Yet, once more, it saved my life.

“This is not the time,” I whispered. I looked at the sky and said, “Dr. Ernesto. Launch orbital containment in ten. Authorization Kappa-Sigma-Omega-1337.”

Dr. Ernesto answered, “Authorization confirmed. Launching orbital containment in 10, 9, 8…”

“Meanwhile, call the World Health Emergency.”

“Opening channel.”

The dial ended almost instantly.

“Dr. Anoli, this is Dr. Krovariv – replica number 3. State the nature of your emergency.”

“I need to speak to the real Dr. Krovariv. This is a purple priority request.”

“Accepted. Please stand by…”

Time stood still while the platform was flying to space. The quietness of linear magnetic propulsion was staggering. My god. Is Nasir going to make it?

“The patient! I am a monst…”

“Dr. Anoli. What is it about?” Her question interrupted my thoughts.

“Dr. Krovariv. I am sorry, but Nasir has been contaminated by an unknown biological agent. I am heading toward space containment. Furthermore, the patient at the source of the contamination is dying. This agent is too dangerous. Please advise on the protocol,” I finished with a frail voice.

“In this situation, the protocols are clear. Once in space containment, you may use the ‘yellow horizon’ protocol. But you know what will happen if it fails.”

“Yes… I know.”

“I am sorry, son. Since CASPR got open-sourced, we are overwhelmed by these idiots… I am waiting for your decision.”

I had a deep look at Nasir. I knew it would probably be the last time I would see him. It was probably the last time I would see anything else.
I exhaled loudly and said, “Permission to print a new species as a countermeasure.”

“Permission granted. Good luck.”

Asking was the easy part. I was her protégé.

As I was climbing into space, the platform changed course. The Athena Life Engineering Lab, or ALEL, was lighting up like a giant light bulb. It had been my dream to one day be worthy of visiting the lab. It was proclaimed as a symbol of hope. But not like this. Not on the verge of my very own death.

ALEL was the only accredited location where humans were allowed to engineer and print new life forms. No one was allowed to penetrate the labs.
Akin to the Athena Orbital Data Centre, it was solely operated by robots and machines following a strict chain of command.

Soon after, the platform slowed down, then stopped near the lab, and the docking system engaged to secure the system. A muffled noise gave me the chills.

“Welcome, Dr. Anoli. My name is Marie Curie. You have been granted the usage of the Athena Life Engineering System by Archdoctor Krovariv of the World Health Organization on the date of 30th December 2060. Any inquiry going against the principles of life and Human Civil Rights will result in your immediate termination. Do you want to hear these principles?”

“No, thank you. I already vowed to fight for life when I became a biomedical engineer.”

“Understood. Please state your prompt.”

I thought to myself, “What? Just like that?”

“Marie Curie, I need you to engineer a biological agent to counter Dr. Nasir’s condition. Focus uniquely on alien gene alteration and accelerated growth features. Do not augment the agent with inorganic material. Do not add metamorphic adaptation capabilities. Encode the maximum life duration to be 72 hours. Add automatic degeneration from 70 hours as a failsafe mechanism. No reproduction. No replication. I need you to confirm the supernumerary growth first before proceeding.”

“Supernumerary growth confirmed. Your prompt is acceptable. Proceeding to the registration of the new specimen identified by Anoli-alpha-31122060-001. Beginning prototyping phase.”

As I reached the point of no return, I could not help but wonder if our life would end here. ALEL was a universal robotic womb. What would happen if I asked the Life Factory to… recreate Nasir… Is rebirth possible?


In 2060

Categories
Technology AR/VR Artificial Intelligence Information Technology

Apple wants to HUGS you

Apple unveiled an innovative #AI method for creating animated human avatars in #3D from real humans named #HUGS.

HUGS stands for Human Gaussian Splats.
It uses 3D Gaussian Splatting (= reconstruction from multiple points of view).

The features are:
🔹Recreates human avatars in 3D from video and scenes.
🔹Separates humans from static scenes in videos.
🔹Uses the SMPL body model for human representation. SMPL = Skinned Multi-Person Linear Model. In essence, it is a way to render a realistic 3D model of the human body.
🔹Generates animations
🔹Achieves high rendering quality at 60 FPS.

Why does this publication matter?

First, it is a clear signal that Apple is also in the AI models race.

Then, interestingly, Apple announced the Vision Pro on the 5th of June 2023, with the promise to provide a #Metaverse experience never seen before.

With HUGS, Apple pushes a foundational building block for making the #AR/#VR experience feel more like real life: the dematerialization of your avatar to increase the sense of intimacy and immersion.

Also, it further extends the seamless continuum from digital to physical and vice-versa. It creates the #Phygital experience.
Digitally generated media is essential to the future of the “Metaverse”.

Links: https://machinelearning.apple.com/research/hugs

🫡

Categories
Technology Artificial Intelligence Automation Autonomous Agents ChatGPT GPT4 Information Technology IT Architecture IT Engineering Robots Testing

Navigating the Future with Generative AI: Part 2, Prompt Over Code – The New Face of Coding

In this installment of the Generative AI series, we delve into the concept of “Prompt as new Source Code”. The ongoing revolution of generative AI allows one to amplify one’s task productivity by up to 30 times, depending on the nature of the tasks at hand. This transformation allows me to turn my design into code, almost eliminating the need for manual coding. The time spent typing, correcting typos, optimizing algorithms, searching Stack Overflow to decipher perplexing errors, structuring the code hierarchy, and working around class deprecations, among other tasks, is now compressed into a single step. This minimization of effort provides me with recurrent morale boosts, as I achieve significantly more in less time and more frequently; I call these instances micro-productivity periods. To put it in perspective, I can simply think about a problem during the day and have a series of conversations with my assistant while I commute. My assistant is always available. In addition, I gain focus time.

I don’t need to wait for a team to prove my concept. Furthermore, in my founder role, I have fewer occasions to write extensive requirement documents than I would when outsourcing developments during periods of parallelization. I just need to specify the guidelines once, and the AI works out the rest for me. Leveraging the AMASE methodology to fine-tune my AI assistant epitomizes the return on investment of my expertise. Similarly, your expertise, paired with AI, becomes a powerful asset, exponentially amplifying the return on your efforts.

Today, information technology engineering is going through a quantum leap. We will explore how structured coding is being replaced by natural language. We refer to this as prompting, which essentially denotes “well-architected and elaborated thoughts”. Prompting, so to speak, is the crystallization of something that aims to minimize the loss of information and cast-out interpretation. In this vein, “What You Read is What You Thought” becomes a tangible reality.

The Unconventional Coding Experience with AI

Although the development cycle typically commences with the design phase, this aspect will not be discussed in this article. Our focus will be directed towards the coding phase instead.

The development cycle with AI is slightly different; it resembles pair programming. Programming typically involves cycles of coding and reviews, where the code is gradually improved with each iteration. An artificial intelligence model becomes your coding partner, able to code 95% of your ideas.

In essence, AI acts as a coach and a typewriter, an expert programmer with production-level knowledge of engineering. The question may arise: “Could the AI replace me completely? What is my added value as a human?”

Forming NanoTeams: Your AI Squad Awaits

My experience leads me to conclude that working with AI is akin to integrating a new teammate. This teammate will follow your instructions exactly, so clarity is essential. If you want feedback or improvements in areas like internal security or design patterns, you must communicate these desires and potentially teach the AI how to execute them.

You will need to learn to command your digital teammate.

Each AI model operates in a distinct yet somewhat similar fashion when it comes to command execution. For instance, leveraging ChatGPT to its fullest potential can be achieved through impersonations, custom instructions, and plugins. On the other hand, Midjourney excels when engaged with a moderate level of descriptiveness and a good understanding of parameter tweaking.

A New Abstraction Layer Above Coding

What exactly is coding? In essence, coding is the act of instructing a machine to perform tasks exactly as directed. We have designed programming languages to be deterministic: repeatable, reliable, and predictable. Code is written in a form that closely resembles human language and is ultimately translated into machine language. This is evident in modern languages like TypeScript, C#, Python, and Kotlin, where control statements are written in plain English, such as “for each”, “while”, “switch”, etc.

With the advent of AI, we can now streamline the stage of translating our requirements into an algorithm, and then into programming code, including structuring what will ultimately be compiled to run the program. Traditionally, we organize files to ensure the code is maintainable by a human. But what if humans no longer needed to interact with the code? What if, with each iteration, AI is the one updating the code? Do we still need to organize the code in an opinionated manner, akin to a book’s table of contents, for maintainability? Or do we merely need the code to be correctly documented for human understanding, enabling engineers to update it without causing any disruptions? Indeed, AI can also fortify the code and certify it using test cases automatically, ensuring the code does not contain regressions and complies with the requirements and expected outcomes.

To expand on this, AI can generate tests, whether they be unit tests, functional tests, or performance tests. It can also create documentation, system design assets, and infrastructure design. Given that it’s all driven by a large language model, we can code the infrastructure and generate code for “Infrastructure as Code”, extending to automated deployment in CI/CD pipelines.

To conclude this section, and echoing my first article in the “Generative AI series”, it is apparent that natural language, expressed as prompts, is now the new programming language. The Large Language Model-based generative AI model is the essential piece of software that elaborates, structures, and completes the input text into code that can be understood by both human engineers and digital engineers.

The New Coding Paradigm

This paradigm shift heralds a new form of coding: augmented coding. Augmented coding diminishes the need to write code in third- and fourth-generation languages, effectively condensing two activities, translating requirements into an algorithm and translating that algorithm into code, into one.

In this scenario, the engineer seldom intervenes in the code. There may be instances where the AI generates obsolete or buggy code, but these can often be rectified promptly in the subsequent iteration.

We currently operate in an explicit coding environment, where the input code yields the visible result on the output. This is known as Input/Output coding.

The profound shift in mindset now is that the output defines the input code. To elucidate, we first articulate how the system should behave, its structure, and the rules it must adhere to; the AI then produces the code that satisfies this description. Essentially, AI has catapulted engineers across an innovation chasm, ushering in the era of Output/Input coding.

Embracing Augmented Coding: A Shift in Engineering Dynamics

The advent of augmented coding ushers in a new workflow, enhancing the synergy between engineers and AI. Below are the core aspects of this transformation:

  1. Idea Expression: The augmented engineer expresses the ideas and the goals to achieve.
  2. Requirement Listing: The engineer lists the requirements.
  3. Requirement Clarification: Clarify the requirements with AI.
  4. Architecture Decisions: Express the architecture decisions (including technology to use, security compliance, information risk compliance, regulatory technical standards compliance, etc.) independently, and utilize AI to select new ones.
  5. Coding Guidelines: Declare the coding guidelines independently and sometimes consult the AI.
  6. Business Logic: Define the business logic in the form of algorithms to code.
  7. Code Validation: Run the code to validate it works as intended. This becomes the first order of acceptance tests.
  8. Code Review: Assess the code to ensure it complies with the engineering guidelines adopted by the company.
  9. Synthetic Data Generation: Use AI to generate data sets that are functionally relevant for a given scenario and a persona.
  10. Mockup-API Generation: Employ AI to generate API stubs that are nearly functionally complete before their full implementation.
  11. Test Scenario Listing: Design the different test scenarios, then consult stakeholders to gather feedback and review their completeness.
  12. Test Case Generation: Have the AI generate the code of the test cases. The same technique applies to security tests and performance tests.
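
As an illustration of steps 1 to 6, here is a minimal sketch of how the pieces an engineer provides could be assembled into a single prompt. The `Specification` class, its field names, and the section titles are hypothetical choices made for this example; they do not come from any particular tool.

```python
from dataclasses import dataclass, field


@dataclass
class Specification:
    """Hypothetical container for what the engineer provides in steps 1 to 6."""
    idea: str
    requirements: list[str] = field(default_factory=list)
    architecture_decisions: list[str] = field(default_factory=list)
    coding_guidelines: list[str] = field(default_factory=list)
    business_logic: list[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Render the non-empty sections as one plain-text prompt."""
        sections = [
            ("Goal", [self.idea]),
            ("Requirements", self.requirements),
            ("Architecture decisions", self.architecture_decisions),
            ("Coding guidelines", self.coding_guidelines),
            ("Business logic", self.business_logic),
        ]
        blocks = []
        for title, items in sections:
            if items:  # skip sections the engineer left empty
                lines = "\n".join(f"- {item}" for item in items)
                blocks.append(f"{title}:\n{lines}")
        return "\n\n".join(blocks)


spec = Specification(
    idea="Track customer invoices",
    requirements=["Export invoices to CSV"],
    architecture_decisions=["Python API", "PostgreSQL storage"],
)
print(spec.to_prompt())
```

Keeping the specification structured, rather than as free text, makes it easier to review each part with stakeholders before handing the rendered prompt to the AI.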

AI can even operate in an autonomous mode to perform a part of the acceptance tests, but human intervention is mandatory at certain junctures. It’s crucial to bridge results with expectations.

Hence, when uncertainties arise, increasing the level of testing is prudent. You remain accountable at acceptance: the tests are how you ensure that the delivered work complies with the requirements.

Non-Negotiable Expectations

Critical business rules and non-functional requirements such as security, availability, accessibility, and compliance by design are often treated as second-class features. Now that AI makes them cheaper to implement, you can simply request them in your prompts, which frees up more time to rigorously test their effectiveness.

Certain requirements are tethered to industry rules and standards, indispensable for ensuring individual or collective safety in sectors like healthcare, aviation, automotive, or banking. The aim is not merely to test but to substantiate consistent performance. This underscores the need for a new breed of capabilities: Explainable AI and Verifiable AI. Reproducibility and consistency are imperative. However, in a system that evolves, attaining these might be challenging. Hence, in both traditional coding and augmented coding (a-coding), establishing a compliance control framework is essential to validate the system’s functionality against expected benchmarks.

To ease the process for you and your teams, consider breaking down the work into smaller, manageable chunks to expedite delivery—a practice akin to slicing a cake into easily consumable pieces to avoid indigestion. Herein, the role of an Architect remains crucial.

Yet, I ponder how long it will be before AI starts shouldering a significant portion of the tasks typically handled by an Architect.

Ultimately, the onus is on you to ensure everything is in order. At the end of the day, AI serves as a collaborative teammate, not a replacement.

Is AI Coding the Future of Coding?

The maxim “And is greater than or” resonates well when reflecting on the exponential growth of generative AI models, the burgeoning number of published research papers, and the observed productivity advantages over traditional coding. I discern that augmented coding is destined to be a predominant facet in the future landscape of information technology engineering.

Large Language Models, also known as LLMs, are already heralding a modern rendition of coding. AI assistants such as GitHub Copilot, or the AI integrated into Android Studio, exemplify this shift. Coding is now turbocharged, akin to transitioning from a conventional bicycle to an electric-powered one.

However, generative AI exhibits a limitation when it comes to pure invention. The term ‘invention’ here excludes ideas birthed from novel combinations of existing concepts. I am alluding to the genesis of truly unprecedented notions. It’s in this space that engineers are anticipated to contribute new code, for instance, in crafting new drivers for emerging hardware or devising new programming languages (likely domain-specific languages).

Furthermore, the quality of the generated code is often tethered to the richness of the training dataset. For instance, code generation for SwiftUI or Rust may encounter challenges owing to the relative scarcity of material on Stack Overflow and the relative youth of these technologies. LLMs could also be stymied by the evolution of code, like the introduction of new keywords in a programming language.

Nonetheless, if it can be written, it can be taught, and hence, it can be generated. A remedy to this quandary is to upload the latest changes in a prompt or a file, as made possible by platforms like claude.ai and ChatGPT’s Code Interpreter. Voilà, you’ve just upgraded your AI code assistant.

Lastly, the joy of coding—its essence as a form of creative expression—is something that resonates with many. The allure of competitive coding also hints at an exciting facet of the future.

Short-Term Transition: Embracing the Balance of Hybrid A-Coding

The initial step involves exploring and then embracing Generative AI embedded within your Integrated Development Environment (IDE). These tools serve as immediate and obvious accelerators, surpassing the capabilities of features like IntelliSense. Proactive code generation while you type takes some getting used to, but whether it’s function implementation, loops, or SQL code, it can hasten both typing and logic formulation.

Before the advent of ChatGPT or GPT-4, I used Tabnine, whose free version was astonishingly effective, adding value to daily coding routines. Now, we have options like GitHub Copilot or StableCode. Google took a clever step by directly embedding the AI model into the Android Studio Editor for Android app development. I invite you to delve into Studio Bot for more details on this integration.

Beware of Caveats During Your Short-Term Transition to Generative AI

Token Limits

Presently, coding with AI is constrained by the number of tokens a model can accept as input and generate as output. A token is essentially a chunk of text, either a whole word or a fragment, that the AI model can process. The way text is split into tokens, known as tokenization, varies between AI models.
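
To make the limit concrete, here is a rough sketch of a pre-flight check that estimates whether a prompt fits within a model's context window. The ratio of 4 characters per token is only a common rule of thumb for English text, not a real tokenizer, and the function names and the amount reserved for the answer are assumptions made for this example. Exact counts require the tokenizer of the specific model you use.

```python
import math

CHARS_PER_TOKEN = 4  # assumption: rough average for English text, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate based on the character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_context(prompt: str, context_window: int, reserved_for_output: int = 1_000) -> bool:
    """Check that the prompt still leaves room for the model's answer."""
    return estimate_tokens(prompt) + reserved_for_output <= context_window


source_file = "x" * 400_000  # stand-in for a large code base pasted into a prompt
print(fits_context(source_file, context_window=8_000))
print(fits_context(source_file, context_window=1_000_000))
```

A check like this is useful mostly as an early warning: when it fails, you split the work into smaller chunks or send only the relevant files.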

I view this limitation as temporary. Papers are emerging that push the token count to 1M tokens (see Scaling Transformer to 1M tokens and beyond with RMT). For instance, Claude.ai, by Anthropic, can handle 100k tokens. Fancy generating a full application documentation in one go?

Model Obsolescence

Another concern is the inherent obsolescence of the data on which these models are trained. For example, at the time of writing, OpenAI’s models are trained on data up to 2022, so any development after that date is unknown to the AI. You can mitigate this limitation by providing recent context or extending the AI model through fine-tuning.

Source Code Structure

Furthermore, Generative AI models do not directly consider folder structures, which are foundational to any coding project.

Imagine, as an engineer, interacting with a chatbot crafted for coding, where natural language could reference any file in your project. You code from a high-level perspective, while the AI handles your Git commands, manages your .gitignore file, and more.

Aider exemplifies this type of Gen AI application, serving as an ergonomic overlay in your development environment. Instead of coding in JavaScript, HTML, and CSS with React components served by a Python API using WebSocket, you simply instruct Aider to create or edit the source code with functional instructions in natural language. It takes care of the rest, considering the project’s file structure and the Git environment. This developer experience is profoundly familiar to engineers. Leveraging a Command Line Interface (CLI) amplifies your capabilities tenfold.

Intellectual Property Concerns

Lastly, the risk of intellectual property loss and code leakage looms, especially when your code is shared with an “AI Model as a Service”, particularly if the system employs Reinforcement Learning from Human Feedback (RLHF). Companies like OpenAI are transparent about how usage data serves to enhance models or craft custom models (e.g. InstructGPT). Therefore, AI Coding Models should also undergo risk assessments.

The Next Frontier: Codeless AI and the Emergence of Autonomous Agents

Names like GPT Engineer, AutoGPT, BabyAGI, and MetaGPT herald a new branch in augmented coding: the era of auto-coding.

These agents require only a minimal set of requirements and autonomously devise a plan along with a coding strategy to achieve your goal. They emulate human intelligence, either possessing the know-how or seeking necessary information online from official data sources, libraries to import, methods, and so on.

However, these agents still often falter unless the task is relatively simple. Despite this, they already show significant promise.

They paint a picture of a future where, for a large part of our existing activities, coding may no longer be a necessity.

Hence, the Prompt Is the New Code

If the code can be generated based on highly specific and clear specifications, then the next logical step is to consider your prompt as your new source code.

This means you can express your specifications as prompts and store those prompts in Git, just as you would store source code.
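
One way to treat a prompt like source code is to record which version of the prompt produced a given piece of generated code. The sketch below is only an illustration under stated assumptions: the `generated-from-prompt` header is a hypothetical convention, not a standard, and the function names are invented for this example. It stamps the generated code with a digest of the prompt so that a later change to the prompt can be detected.

```python
import hashlib

HEADER_PREFIX = "# generated-from-prompt: "  # hypothetical marker, not a standard


def prompt_digest(prompt: str) -> str:
    """Return a SHA-256 digest of the prompt, ignoring surrounding whitespace."""
    normalized = prompt.strip() + "\n"
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def stamp(generated_code: str, prompt: str) -> str:
    """Prefix the generated code with the digest of the prompt that produced it."""
    return f"{HEADER_PREFIX}{prompt_digest(prompt)}\n{generated_code}"


def is_stale(stamped_code: str, prompt: str) -> bool:
    """Tell whether the code was generated from a different version of the prompt."""
    first_line = stamped_code.split("\n", 1)[0]
    return first_line != HEADER_PREFIX + prompt_digest(prompt)


prompt_v1 = "Build a REST API for invoices."
code = stamp("print('hello invoices')\n", prompt_v1)
print(is_stale(code, prompt_v1))                      # False: the prompt is unchanged
print(is_stale(code, prompt_v1 + " Add pagination."))  # True: the prompt has evolved
```

With the prompt file committed to Git, a stale stamp is a signal that the code should be regenerated and certified again.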

CD/CC with Adversarial AI Agent

Suddenly, Continuous Integration/Continuous Delivery (CI/CD) becomes Continuous Development/Continuous Certification (CD/CC), where the prompt enables the development of working pieces of software, which will be continuously certified by a testing agent working in adversarial mode: you continuously prove that it works as intended.

The good thing is that the benefits stack up: the human specifies, the AI codes and deploys, the AI certifies, and the human finally uses the materialization of their thoughts. The AI, in turn, learns from human usage. We close the loop.
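
The loop described above can be sketched in a few lines. This is a toy illustration, not an actual agent framework: `fake_generator` is a stand-in for a code-generating model, the "safe integer division" task and its test cases are invented for the example, and a real certifying agent would produce its adversarial cases itself.

```python
from typing import Callable, Iterable, Optional

Candidate = Callable[[int, int], int]
Case = tuple[tuple[int, int], int]  # (arguments, expected result)


def certify(candidate: Candidate, cases: Iterable[Case]) -> list[str]:
    """Adversarial step: return the failures found (an empty list means certified)."""
    failures = []
    for args, expected in cases:
        try:
            actual = candidate(*args)
        except Exception as exc:  # a crash also counts as a failed certification
            failures.append(f"{args}: raised {exc!r}")
            continue
        if actual != expected:
            failures.append(f"{args}: expected {expected}, got {actual}")
    return failures


def develop_and_certify(generate: Callable[[str, list[str]], Candidate],
                        prompt: str,
                        cases: list[Case],
                        max_rounds: int = 3) -> Optional[Candidate]:
    """Regenerate until the certifier finds no failure, or give up after max_rounds."""
    feedback: list[str] = []
    for _ in range(max_rounds):
        candidate = generate(prompt, feedback)
        feedback = certify(candidate, cases)
        if not feedback:
            return candidate
    return None  # not certified: escalate to a human


def fake_generator(prompt: str, feedback: list[str]) -> Candidate:
    """Stand-in for a model: wrong on the first try, fixed once it receives feedback."""
    if not feedback:
        return lambda a, b: a // b  # crashes when b == 0
    return lambda a, b: a // b if b else 0


cases = [((6, 3), 2), ((1, 0), 0)]  # the second case is the adversarial one
certified = develop_and_certify(fake_generator, "safe integer division", cases)
print(certified is not None)
```

Returning `None` after a fixed number of rounds reflects the point made earlier: when the AI cannot prove that the software works as intended, a human has to step in.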

Integrating New Technology into Traditional Operating Models

AI introduces a seamless augmentation, employing the most natural form of communication—natural language, encompassing the most popular languages on Earth. It stands as the first-of-its-kind metamorphic software building block.

However, the operating model with AI isn’t novel. A generative AI model acts as an assistant, akin to a new hire, fitting seamlessly into an existing team. The workflow initiates with a stakeholder providing business requirements, while you, the lead engineer, guide the assistant engineer (i.e. your AI model) to execute the development at a rapid pace.

Alternatively, your team can be formed by a suite of AI interactions, with the AI assuming various roles such as dev engineer, ops engineer, or functional analyst. This interaction model entails externalizing the development service from the IT organization. Here, stakeholders still liaise through you, as lead engineer or architect, but you refine the specifications to the level of a fixed-price project. Once finalized, the development is entirely handed over to an autonomous agent. This scenario aligns with insourcing when the AI model is in-house, or outsourcing if the AI model is sourced as a service, with the GPT-4 API effectively becoming a development service from a third-party provider like OpenAI.

AI infuses innovation into a traditional model, offering stellar cost efficiency. At the time of writing, OpenAI’s pricing for GPT-4 (32K context) stands at $0.06 per 1,000 input tokens and $0.12 per 1,000 output tokens. Considering code generation alone (excluding shifting deadlines, staffing activities, team communication, writing tasks, etc.), for 100,000 lines of code with an average of 100 tokens per line (a generous estimate for typical code), the cost calculation is straightforward:

100,000 × 100 = 10,000,000 tokens; (10,000,000 tokens × $0.12) ÷ 1000 = $1,200. This cost equates to a mere two days of development at standard rates.

For perspective, Minecraft comprises approximately 600,000 lines of Java code. At the same rate, the output tokens alone would cost about $7,200 (600,000 × 100 × $0.12 ÷ 1000). Theoretically, you could generate a Minecraft-like project for less than $10,000, including the costs of input tokens.
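
The back-of-the-envelope arithmetic above can be reproduced with a few lines of Python. This is only a sketch: the function name and parameters are illustrative, the default prices are the per-1,000-token figures quoted in this article, and any input-token volume you pass in is your own assumption, since none is stated here.

```python
def generation_cost(lines_of_code: int,
                    tokens_per_line: int = 100,
                    output_price_per_1k: float = 0.12,
                    input_tokens: int = 0,
                    input_price_per_1k: float = 0.06) -> float:
    """Estimate the API cost, in dollars, of generating a code base in a single pass."""
    output_tokens = lines_of_code * tokens_per_line
    output_cost = output_tokens / 1000 * output_price_per_1k
    input_cost = input_tokens / 1000 * input_price_per_1k
    return output_cost + input_cost


print(f"100,000 lines: ${generation_cost(100_000):,.2f}")  # $1,200.00
print(f"600,000 lines: ${generation_cost(600_000):,.2f}")  # $7,200.00
```

The estimate covers a single generation pass only; every retry or correction round multiplies the token count, and therefore the bill.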

However, this logic is simplistic. In reality, autonomous agents undergo several iterations and corrections before devising a plan and rectifying numerous errors. The quality of your requirements directly impacts the accuracy of the generated code. Hence, mastering the art of precise and unambiguous descriptive writing becomes an indispensable skill in this new realm.

Wrap up

Now, you stand on the precipice of a new coding paradigm where design, algorithms, and prompting become your tools of creation, shaping a future yet to be fully understood…

This transformation sparks profound questions: How will generative AI and autonomous agents reshape the job market? Will educational institutions adapt to this augmented coding era? Is there a risk of losing the depth of engineering expertise we once relied upon?

And as we move forward, we can only wonder when quantum computing will introduce an era of instantaneous production, where words will have the power to change the world in real time.

🖖