How to design an OS for the future
Imagine that in today’s world we only had two types of cars, Ford and Chevy. All the Ford cars only work on Ford roads, and all the Chevy cars only work on Chevy roads. If you want to open a business, you have to build one on each type of road. That’s pretty much what today’s smartphone app ecosystem is like, and it really makes no sense. Hopefully, things won’t stay like this for much longer.
Right now we’re nearing the end of the Information Age of human civilization. Some are calling the next age of human evolution the “Augmented Age”. Many of us already use technology to augment ourselves, and the future certainly looks like we’ll be adding artificially intelligent constructs to that augmentation even more. We already have many variants of narrow intelligence being integrated with our phones, computers, game consoles, speakers, cars, and home appliances. The problem is, our current computing and smartphone operating systems (Windows, Linux, macOS, iOS, Android, etc.) really aren’t designed to accommodate augmented evolution in a positive or efficient way.
I don’t really know anything about programming an operating system or even a smartphone app from scratch, but as a user experience critic/consultant for about 20 years and a web developer for over 20 years, I do know a lot about what a good human-computer interaction system looks like from the user perspective.
Learn from the Web
Do you know what was designed to accommodate augmented evolution in a positive and efficient way? The same thing that pushed us so quickly into the Information Age! That’s right, the internet… or more specifically, the World Wide Web section of the internet.
The Web was built in a way where all sites and pages are meant to be delivered in a standard format to any web browser. Every page that we access on the web is delivered in hypertext markup language. It’s very simple. The browser handles rendering of the content and user interface so that humans can use it. I can build a website and every browser in the world can display its content. Yes, okay, some people break their own websites in certain browsers because they don’t understand graceful degradation or progressive enhancement development techniques, but that’s only because we’re lazy humans. Still, in general, every link is an anchor tag with a hyperlink reference attribute, and all text, images, form controls, and other content are tagged with their corresponding hypertext markup as well.
Your phone and computer apps are not built like that at all. I can build one website and all web browsers ever created will be able to render it in some capacity. That’s not true of apps on iOS, Android, Windows, macOS, Linux, etc. What we have as computing operating systems and applications right now is akin to the early days of the World Wide Web, when companies like AOL and Prodigy kept certain content exclusive to their customers, or when Internet Explorer 4 through 6 had very specific features that only worked in Microsoft’s browser. What made the Information Age succeed was cooperating on standards and competing on top of them, not competition between proprietary ecosystems.
Because the web is built in such a consistent manner, it is very easy to build Artificial Intelligence constructs that can read and interact with everything on the web. That’s what Siri, Google, Alexa, and Cortana already do. Those speech UI systems read the web and process its information very easily because everything is essentially structured in the same way. Other AI constructs interact with things like Twitter or email or SMS because those are easy for anyone to plug into.
Anyone can build a web browser that creates voice commands for interacting with web pages or searching for information and reading it out loud… and anyone can create an AI construct that processes web page content for learning and evolution.
This is not at all true of the applications that run on our current-day operating systems. Want your AI construct to interact with iTunes or iMessage or FaceTime or Snapchat or Facebook Messenger or WhatsApp? Good luck with that! Because every app on every platform is written and designed differently, there’s no easy way for a single artificial intelligence construct to interact with all of them at the same time.
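To make the contrast concrete, here is a minimal sketch in Python of how little it takes for a program to find the links on any web page. It uses only the standard library's html.parser module. The two sample pages and the names LinkCollector and collect_links are made up for illustration; the point is only that one small reader works on both pages because both tag their links the same way.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs from any HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> element we are inside, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def collect_links(html):
    """Return every (href, text) pair found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Two invented pages with unrelated layouts (illustrative only).
PAGE_A = '<nav><a href="/news">News</a> <a href="/about">About us</a></nav>'
PAGE_B = '<div><p>Read <a href="/docs"><b>the docs</b></a> first.</p></div>'

print(collect_links(PAGE_A))
print(collect_links(PAGE_B))
```

Nothing comparable exists for native apps: there is no shared tag that says "this is a control and here is what it does", so a reader like this would have to be rewritten for every app.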
Intelligent User Interface
Furthermore, it’s very easy for an intelligent web browser or AI construct to re-use or augment website content to fit into a different usage scenario. Any web browser can completely reformat a web page based on the user’s preference or the task at hand. Most browsers these days have a “reading mode”, and some browsers in the past actually let you override a website’s cascading style sheet formatting with your own cascading style sheet. The early mobile browsers had intelligent zoom features that let you focus in on specific areas of content very easily, since the browsers and web developers at the time weren’t very good at reformatting content for smaller screens.
The operating systems of the future should be capable of creating intelligent graphical user interfaces along with intelligent speech user interfaces. (Also see “When will smartphones be intelligent?”) Currently, none of the mobile or desktop computing ecosystems have intelligent user interfaces at all. In fact, most are designed to be totally stupid. For example, smartphone operating systems still put interactive elements at the top of the screen where a human hand can’t reach them while holding the phone. It’s an extremely stupid design left over from when smartphone screens were generally only 3.8″ diagonally. That’s just one example! Many smartphone app designers make use of ambiguous, unintelligible icons that normal people and artificially intelligent constructs will have great difficulty understanding. In many cases, only the person who made up the icon will have any idea what it means.
If all of our applications were designed like web pages where all elements and controls were tagged the same way, an intelligent GUI would easily be able to display those controls in a way that makes sense to each particular individual user or external system. If you like a cleaner UI with icons that you’ve already learned to understand, an Intelligent GUI could modify all of your apps to display that type of UI for you. If a beginner user doesn’t understand a developer’s crazy icon design or wants to increase cognitive ease, an Intelligent GUI could modify it to show text labels that the user can instantly understand. This would be huge for increasing usability and efficiency in the age of augmentation!
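As a rough sketch of that idea in Python: suppose each app declared its controls as data instead of drawing them itself. The Control fields and the three style names below are hypothetical, invented only for this illustration. The same toolbar can then be shown as icons, as text labels, or as both, depending on what the user prefers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    """One tagged control as an app might declare it (hypothetical schema)."""
    action: str   # machine-readable name, e.g. "share"
    label: str    # human-readable text
    icon: str     # the developer's chosen icon (ASCII stand-in here)


def render(controls, style):
    """Return the strings a GUI would draw for each control.

    style is a per-user preference: "icons", "labels", or "both".
    """
    if style not in ("icons", "labels", "both"):
        raise ValueError(f"unknown style: {style}")
    drawn = []
    for control in controls:
        if style == "icons":
            drawn.append(control.icon)
        elif style == "labels":
            drawn.append(control.label)
        else:
            drawn.append(f"{control.icon} {control.label}")
    return drawn


# An invented toolbar; the app never decides how it is displayed.
TOOLBAR = [
    Control("share", "Share", "[>]"),
    Control("delete", "Delete", "[x]"),
]

print(render(TOOLBAR, "icons"))    # experienced user: compact icons
print(render(TOOLBAR, "labels"))   # beginner: plain text labels
```

The design choice that matters is that the app supplies meaning (action, label, icon) and the system supplies presentation.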
You know what people love about interacting with things? Consistency. We love it when we walk into a building or step into the car and everything just works. The doors work the same way, the lights work the same way, everything looks like it belongs exactly where it is. Unfortunately, consistency is extremely rare in today’s computing environment. Even within your own smartphone there is a huge amount of inconsistency! Every app you open has a completely different color scheme and navigation method. If you’re an iOS user, you know that some apps have back buttons and some don’t. Sometimes swiping the edges navigates within an app, sometimes it doesn’t. Human-computer interaction designers can’t even make the most basic of page controls consistent! Scrollbars! Seriously, just look at Windows 10. Within that single ecosystem there are many different styles of scrollbar for no good reason at all.
If an operating system were designed like a web browser, where each app tagged every part of its interface accordingly and kept its back-end proprietary programming separate, an intelligent operating system could easily enforce consistency on each user’s device according to the user’s preferences.
If you’re decorating your home, you’re probably going to choose furniture and wall colors that follow a specific theme. Maybe you’ll paint the walls or furniture yourself in order to make things consistent. In today’s computing operating systems and app ecosystems, that is not possible. You’re stuck with the awful design that the app developer picked out.
Once upon a time it was very easy to create custom themes for operating systems like Windows 95 and even Windows Mobile (Pocket PC 2002-6.5), where the colors you chose and the tweaks you made in the appearance control panel would apply to all of your applications, making for an extremely consistent and cohesive experience. If the designers of our modern operating systems and computing ecosystems had planned ahead properly, we would still have global UI design control, and Artificial Intelligence constructs would be able to tap into that in order to control design consistency for each user’s preference. Unfortunately, that’s not the case.
If all of our applications had their front-end UIs tagged in the same way, just like how web pages tag their UI content and controls, we could easily have system-wide artificial intelligence constructs that could control and modify the UI of every app. Not only could this be used for theme preferences, but for usage preferences as well (which functions does each person need quickest access to?). Even things like smart speakers that don’t have screens could be built to pick up the structural markup of each app and surface a speech user interface to interact with them automatically! Keyboard shortcuts and voice prompts could easily be assigned by either a user or the artificial intelligence constructs installed within the operating system. Priority functions that the specific user needs quick access to would be surfaced most prominently on all of the user’s devices, while less-used functions would be hidden in the overflow interface.
Of course, we also have this consistency problem with human languages too, and that’s an issue Intelligent Speech User Interfaces will have to learn about as well. There are huge groups of people who speak and understand totally different languages and dialects. We have many redundant words on this Earth full of humans. Still, let’s look at websites again… since they’re all tagged the same way on the front end using hypertext markup language, it’s very easy to translate web pages into different languages. Anyone could build a web browser that does this automatically, and we already have plenty of browser plug-ins and other web services that do just that. What happens if you want to translate a smartphone’s badly-designed GUI into something that you can understand? That’s not going to happen, right? Developers who design apps and programs have to make a different version for each language manually (if they even use actual language instead of made-up icons that nobody understands). Then the user has to specifically choose to install the language version that they want. That’s not efficient!
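Here is a small Python sketch of the two mechanics described above: surfacing the most-used functions and pushing the rest to overflow, and handing out keyboard shortcuts automatically. The function names, the usage numbers, and the first-free-letter shortcut rule are all assumptions made up for this example, not part of any existing system.

```python
def arrange(usage_counts, slots):
    """Split actions into a prominent list and an overflow list.

    usage_counts maps an action name to how often this user triggers it.
    The most-used actions fill the available slots. Ties fall back to
    alphabetical order so the layout stays stable between runs.
    """
    ranked = sorted(usage_counts, key=lambda name: (-usage_counts[name], name))
    return ranked[:slots], ranked[slots:]


def assign_shortcuts(actions):
    """Give each action the first letter of its name that is still free.

    An action whose letters are all taken simply gets no shortcut.
    """
    owner_of_key = {}
    for name in actions:
        for ch in name.lower():
            if ch.isalpha() and ch not in owner_of_key:
                owner_of_key[ch] = name
                break
    return {name: key for key, name in owner_of_key.items()}


# Invented usage data for one user of a mail app.
USAGE = {"reply": 40, "archive": 12, "forward": 3, "report": 1, "print": 0}

prominent, overflow = arrange(USAGE, slots=2)
print(prominent, overflow)
print(assign_shortcuts(["reply", "archive", "report"]))
```

A real system would learn the counts over time and sync them across the user's devices; the sketch only shows that none of this needs the app's cooperation once the actions are tagged.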
It would be so much better if our computers could alter apps according to our preferences the same way our web browsers can alter web pages according to our preferences.
We need collaborative rules to grow faster
Every big jump in the evolution of human civilization was made possible by agreeing on some sort of collaborative standards and rules. For the World Wide Web’s “Information Age”, that was a standard format for serving web pages to web browsers over a standard protocol, and encouraging those web browsers to render those web pages consistently. In the past, we’ve standardized on things like the telephone. Any plain old telephone is capable of communicating with any other telephone around the world. Electricity is distributed around a power grid without compatibility issues for the appliances built for it. Most of the world’s train tracks share a standard gauge so that trains can cross from one line to the next. In Roman times, about two thousand years ago, roads were built in a consistent manner, with wheel spacing said to match the width of two horses standing side by side, so that all the wagons and chariots could travel efficiently (and, as the popular story goes, modern train tracks were designed to match that width). When we learned how to build a fire or make a wheel, we taught the rules of those techniques to others so they would know how to do it right.
In the software application development and computing operating system ecosystems of today, we’ve got hardly any widespread collaborative rules, and that’s going to make jumping into the “Age of Augmentation” much more difficult.
Some of these ideas about taking lessons from web development and applying them to application development might sound familiar. There is actually already some work going on in the area of “Progressive Web Apps”. Relatively recently, websites (in certain web browsers) have gained the ability to add functions that were normally reserved for native apps. For example, websites can now be programmed to respond to input from a device’s GPS receiver, accelerometer, and touch screen. They can also function offline in some cases and can even provide push notifications. Google has started investing in these, and Microsoft recently announced that it is making PWAs available in the Microsoft Store.
While I’m not sure if upgrading the world wide web to a world wide application ecosystem is exactly the right thing to do, it certainly sounds like that direction could be really great if we get it right.
Microsoft also had a similar idea in its “Universal Windows Platform” initiative, where a new style of application development was supposed to make it easy to write one program for every type of Windows device and have it respond to each device’s screen size and capabilities. Unfortunately, that hasn’t panned out very well at all. Microsoft’s own developers can’t make Windows 10 apps that follow any sort of consistent GUI design. None of them have a proper access keys interface or global voice command controls or even a decent theme structure.
If we can standardize on a front-end GUI tagging structure and offload UI design formatting and speech interfaces to the operating system and/or theme design add-ins, that could be really great for consistency and system-wide artificially intelligent construct integration. And just like the Web, front-end code could still be generated by any kind of back-end code you want to use. I can write a fully HTML5-compliant web page that’s generated using ancient ASP server-side coding. The back-end coding for our operating system of the future could still be anything, and should probably be modular in the same way that web servers are. If you want to code in PHP, you usually have to add that support to your web server. Why can’t our computing operating systems be that flexible and platform agnostic? They certainly should be. I don’t really need a specific operating system or web server software in order to program a website in PHP or ASP.NET or Perl or whatever. I just need that back-end development environment installed. That’s how an operating system of the future should work. Want to run a Win32 app? Let the OS install the Win32 subsystem support automatically.
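The "install the subsystem automatically" idea can be sketched in a few lines of Python. Everything here is hypothetical: SubsystemRegistry, its methods, and the runtime names are invented for illustration, and a real operating system would download and sandbox the runtime rather than add a string to a set. The sketch only shows the behavior: the back-end runtime is fetched the first time an app needs it and reused afterwards.

```python
class SubsystemRegistry:
    """Tracks which back-end runtimes are installed (all names hypothetical)."""

    def __init__(self, available):
        self._available = set(available)   # runtimes the OS knows how to fetch
        self._installed = set()
        self.log = []                      # record of install actions taken

    def ensure(self, runtime):
        """Install the runtime on first use; do nothing on later calls."""
        if runtime in self._installed:
            return
        if runtime not in self._available:
            raise LookupError(f"no subsystem package for {runtime!r}")
        self._installed.add(runtime)
        self.log.append(f"installed {runtime}")

    def launch(self, app_name, runtime):
        """Make sure the app's runtime exists, then hand the app to it."""
        self.ensure(runtime)
        return f"{app_name} running on {runtime}"


registry = SubsystemRegistry(available=["win32", "php", "perl"])
print(registry.launch("notepad.exe", "win32"))   # triggers the install
print(registry.launch("calc.exe", "win32"))      # reuses the installed runtime
print(registry.log)
```

This mirrors how a web server gains PHP support: the front end stays the same, and the back-end support is a module that gets added when it is needed.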
Universal Front End
Of course this new idea of designing applications where the front-end GUI or Speech UI is controlled by user-selected themes installed into the operating system needs a name. I suggest “Universal Front End”. That gets the point across that we want this to be something that can be universally modified intelligently by either artificial intelligent constructs or user-generated themes. We can call the markup language used in Universal Front End design something like “UFEL” (Universal Front End Language).
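To give a feel for what UFEL could look like, here is a purely speculative sketch in Python. No UFEL syntax exists: every element name and attribute below (app, view, action, label, priority) is invented for this example, and XML is used only because the standard library can parse it. The point is that once an app describes its interface this way, any theme or AI construct can list what the app is able to do.

```python
import xml.etree.ElementTree as ET

# A made-up UFEL document. Every element and attribute name is invented
# for illustration; the article proposes the idea, not a syntax.
UFEL_SOURCE = """
<app name="Mail">
  <view id="inbox" title="Inbox">
    <action id="compose" label="Compose" priority="high"/>
    <action id="search" label="Search" priority="high"/>
    <action id="settings" label="Settings"/>
  </view>
</app>
"""


def list_actions(ufel_text):
    """Return (view id, action id, label, priority) for every action.

    Actions without a priority attribute are treated as "normal".
    """
    root = ET.fromstring(ufel_text)
    rows = []
    for view in root.iter("view"):
        for action in view.iter("action"):
            rows.append((
                view.get("id"),
                action.get("id"),
                action.get("label"),
                action.get("priority", "normal"),
            ))
    return rows


for row in list_actions(UFEL_SOURCE):
    print(row)
```

A real markup language would need far more (layout hints, input types, state), and would have to come out of a standards process rather than a blog post.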
Way more room for innovation
If we can separate the front-end user interface from the back-end programming of our apps and make the device operating system drive the interface in a consistent manner, we’ll open up huge opportunities for innovation. You can make a new phone with a circular screen that includes a UI theme which modifies all existing apps with a new consistent UI design. Or you can make a smartphone with no screen at all and a system theme that converts all UI commands to speech commands. Or you can make a large display with a system theme that converts all UI commands to Kinect or “Minority Report” style hand gestures. Or you can make a Virtual/Augmented Reality system theme that modifies all app commands to show up in a 3D space. Or you can make a system theme that makes all apps look like they’re running on Mac OS 8.1. Or you can make a system theme that makes everything look like Star Trek’s LCARS system. The possibilities open up so widely when you don’t have to limit apps to terrible little rectangular touch screens like we do today in the case of iOS and Android.
By taking the responsibility of good GUI design away from individual developers (who, let’s face it, are terrible at it), we can open up a market for system-wide theme designers. That will generate innovation in user interface design that applies to the entire device in a consistent manner based on the preference of the individual. It will also open up innovation in artificial intelligence design that again interacts with all other apps & peripherals owned by the user in a consistent manner. All of those things are huge positives for the age of augmentation.
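The theme market described in this section could work roughly like the Python sketch below. The Theme protocol and the two theme classes are hypothetical and deliberately tiny: one stands in for a screen and one for a screenless, speech-only device. Both receive the same list of commands from the app and present them in their own way.

```python
from typing import Protocol


class Theme(Protocol):
    """Anything that can present a command to the user (hypothetical)."""

    def present(self, label: str) -> str: ...


class ScreenTheme:
    """Draws each command as a labelled button (text stands in for pixels)."""

    def present(self, label: str) -> str:
        return f"[ {label} ]"


class SpeechTheme:
    """Exposes each command as a phrase the user can say."""

    def present(self, label: str) -> str:
        return f'Say "{label.lower()}"'


def apply_theme(theme: Theme, labels):
    """Present every command through whichever theme the device uses."""
    return [theme.present(label) for label in labels]


# The app only declares its commands; it knows nothing about the device.
COMMANDS = ["Play", "Next Track"]

print(apply_theme(ScreenTheme(), COMMANDS))
print(apply_theme(SpeechTheme(), COMMANDS))
```

Adding a gesture theme, a VR theme, or an LCARS theme would mean writing one more class with a present method, with no change to any app.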
Remember where you came from
I think another big problem behind today’s mess of mobile apps, desktop apps, VR apps, and smartwatch apps is that many of today’s developers and user experience designers haven’t kept up with the history of computing. If all of the knowledge about how something was built 30 years ago is gone, that’s a big problem that’s slowing down our growth… especially if that solution from the past could have helped us in the present. Soon it may be impossible to learn about or use the original Palm OS, BlackBerry OS, Psion, Windows CE, webOS, and various versions of Symbian, and that’s a shame because many of those systems had innovative features that are still missing in today’s most popular operating systems.
So, the operating systems of the future should absolutely be capable of emulating the history of computing! In some cases, this is already very easy. I can install Windows 3.0 on a Hyper-V virtual machine. I can run some original Xbox games from 18 years ago on a modern Xbox One. I can play ancient video games from the 80s on an NES Classic Mini. Bring back Windows ME so you can learn from it! The World Wide Web is like the ancient Library of Alexandria in Egypt more than two thousand years ago, where scholars would copy every book that they could get their hands on for the proliferation of knowledge. When we lose those archives of knowledge to fires or server shutdowns, we’re doomed to repeat our mistakes. The same is true as we lose the ability to learn from programs and operating systems of the past.
The current technology high-rollers such as Apple, Google, Amazon, and maybe Microsoft probably won’t want to invest in an open collaborative structure for software, themes, artificial intelligence, and operating system development. Those companies are probably going to be more interested in owning users, locking them into specific ecosystems, and leveraging them for profits. A collaborative set of rules for designing the computing operating systems of the future would probably have to come from someplace else, like the World Wide Web Consortium or some other type of standards development body. A lot of developers would need to get on board, but I firmly believe that something like this would reinvigorate innovation and competition within the technology world as a whole as well as improve consumer adoption. Perhaps it would be wise for the big OS companies to spearhead some collaborative rules after all.