The future of human-computer interactions with NLP
Disclaimer: These are notes and thoughts I took down while listening to Dharmesh Shah (CTO of HubSpot) and Wojciech Zaremba (co-founder of OpenAI).
Every software company that ever was has claimed, “our product is intuitive and easy to use”. The reality is that every one of them is lying. Here’s what actually happens when a human uses software: you start from the thing you want to do, then translate that intent into a set of actions (drags, drops, and swipes) to make the software do it. For example, if you’re in Photoshop and want to remove the background, you should just be able to say “remove the background from this image”. If you’re in HubSpot, you should be able to ask, “how many people have signed up for our product in the last 90 days?”
You should be able to just type that question in, because that is your intent. You should be able to express the thing you want, and the technology should figure it out. We now have the technology on the language side to understand language; all we really need is a translation layer.
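To make the shape of such a translation layer concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `contacts` data and the hand-written intent mapping in `answer` stand in for what a language model would actually produce.

```python
from datetime import datetime, timedelta

# Toy contact records standing in for a CRM database (hypothetical data).
NOW = datetime(2022, 6, 1)
contacts = [
    {"name": "Ada", "signed_up": NOW - timedelta(days=10)},
    {"name": "Grace", "signed_up": NOW - timedelta(days=50)},
    {"name": "Alan", "signed_up": NOW - timedelta(days=200)},
]

def answer(question: str) -> int:
    """A hard-coded 'translation layer': map one intent to one query.

    A real system would use a language model to turn the question into
    the query; here the mapping is written by hand to show the idea.
    """
    if "signed up" in question and "90 days" in question:
        cutoff = NOW - timedelta(days=90)
        return sum(1 for c in contacts if c["signed_up"] >= cutoff)
    raise ValueError("intent not recognized")

print(answer("How many people have signed up in the last 90 days?"))  # 2
```

The point is not the string matching, which a model would replace; it is that the user supplies intent as a sentence, and the software turns it into the query it already knows how to run.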
This is shaping up to be a megatrend bigger than mobile. Mobile translated what you used to do on your desktop so you could do it from anywhere, and that enabled a bunch of new use cases.
This is bigger: that piece of software you have been trying to learn, the one you never got good at because you never had the time. Now a billion people can use software that was effectively out of reach before, because they don’t have to learn the clicks and drags to make it work. They can express the thing they want, and the software can do it.
If you were an enterprising VC choosing the categories where this will have the biggest impact, business intelligence, reporting, and B2B software are a natural fit. On the consumer side, the beginnings are things like Alexa, but the B2B world is wide open for making software actually intuitive: just make it do what I want.
It’s a magic trick: you just say this and that and get the result, that easily, like waving a wand and getting a surprisingly good outcome. Once you see it, you can’t unsee it, and you can no longer look at products that don’t have it in the same way.
Google is kind of a magic genie: you give it a question, and it tries to come up with answers. If this can actually be done, it is the next level beyond Google, in that you can ask it any question and, instead of getting a set of pages where the answer might be, it actually tells you the answer.
Check out this demo of using Codex as a search engine: mullikine.github.io/posts/search-the-web-wi..
It turns out that many tools out there (Google Calendar, Microsoft Word) have internal APIs for building plug-ins around them, so there is already a sophisticated way to control these tools programmatically. Today, if you want to add new, more complicated behaviors to these products, you need to add a new button for every new behavior. With Codex, you could instead tell your calendar, for instance, to schedule a meeting next week after 2 pm. It would then write the corresponding piece of code, which is the thing you ultimately want. There is no need to hard-code each interaction; Codex can drive each API on the fly.
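Here is a minimal sketch of that pattern in Python. The `Calendar` class is a hypothetical stand-in for a real calendar’s plug-in API, and the `generated_code` string is the kind of snippet a code model might emit for the intent “schedule a meeting next week after 2 pm”; nothing here is an actual Codex output or a real product API.

```python
from datetime import datetime, timedelta

class Calendar:
    """Hypothetical stand-in for a calendar app's internal plug-in API."""
    def __init__(self):
        self.events = []

    def schedule(self, title, start):
        self.events.append((title, start))
        return start

# The kind of snippet a code model might write for the intent
# "schedule a meeting next week after 2 pm" (assumed model output).
generated_code = """
today = datetime(2022, 6, 1)  # a Wednesday, fixed for reproducibility
next_monday = today + timedelta(days=7 - today.weekday())
start = next_monday.replace(hour=14, minute=0)
calendar.schedule("Meeting", start)
"""

calendar = Calendar()
# The host application executes the generated snippet against its own API.
exec(generated_code, {"datetime": datetime,
                      "timedelta": timedelta,
                      "calendar": calendar})
print(calendar.events)  # [('Meeting', datetime.datetime(2022, 6, 6, 14, 0))]
```

The interesting property is that the calendar exposes only one generic method; the specific behavior (“next week after 2 pm”) lives in code written on the fly, not in a button someone had to ship in advance.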