Functionality

The functionality of the current SQL language is listed here, and some ideas how NewSQL should look like. However, nothing is fixed at this time.

Case sensitivity of the language

SQL is not case sensitive

Idea: Most programming languages are case sensitive, and NewSQL should be the same. As it is easier to type lowercase then all uppercase, so the plan is to have the keywords in lower case.

Identifiers (Names)

SQL allows quoted names.

Idea: Do not support quoted names. There seems to be no good reason why a programming language should allow spaces in identifiers. It is possible to support a user friendly (graphical-) interface to a database, by assigning display names and icons to the different objects. The priority is the developer, and most developers do not like spaces in identifiers.

Standard data types

SQL supports a variety of data types, and most vendors have a different set of types.

Support the common data types (integer, string, date, decimal, binary)
Size restrictions in data types are superfluous: there is no limit how long a Java String can be, and most people do not complain about that.

Advanced data types

This are data types that are not supported by all databases, but are usefull in many cases. In SQL, this feature is largely vendor specific.

Case insensitive String data type support is required for many real-world use cases. That doesn't mean the data is stored in uppercase or lowercase - but the comparison is done insensitive (ignoring the case). Just case insensitive comparison operators are not enough: it must be possible to define a case insensitive index. This data type is maybe already a user defined data type, it is not strictly a standard data type.
Arrays of standard data types; or arrays of advances / user defined data types
To store objects alone, there is no object data type required, because is it possible to use binary. There have been many efforts to build object databases, but still most new projects are using relational databases. However to build a index on this objects, or to execute some operations on them. The question is what programming language should be supported. One idea is Java.

Large Objects

Large objects are objects that may not fit into memory. SQL 99 supports CLOB (Character Large OBject) and BLOB (Binary Large OBject).

There is no need for a special data type. If memory is a problem, the application (or database API) is responsible for loading / not loading the value into memory. But that means it must be possible to retrieve the size of a value before reading the value itself.

Autoincrement

There is no standard for autoincrement, but many vendors support it in some way.

This is a popular feature, support it on the language level if that makes sense.

Data Definition

CREATE TABLE, DROP TABLE. There is no standard for CREATE INDEX and DROP INDEX, however most database vendors support it. ALTER TABLE ADD COLUMN, and so on.

Support renaming tables.
If data definition can be done by inserting into the system tables, then moving table defintion around would be much simpler.

Check constraints

This seems to be similar to triggers. Just support triggers instead.

Referential integrity

Most database administrators are converened about referential integrity. Sometimes this feature makes live much easier (when bugs are found early using this mechanism), sometimes harder (when data has to be moved around).

Support it as a special, simple to define sort of triggers

Default values

Seems to be similar to triggers. Just support triggers.
Don't distinguish between NULL and DEFAULT values when inserting a new row.

Nullable / non-nullable columns

In SQL, columns can be defined NOT NULL.

Use rules similar to Java, where some data types are non-nullable (for example int, long, double) and some are nullable (Integer, String). This could also help to simplify the rules for NULL handling.

Data Manipulation

INSERT, UPDATE, DELETE

Support UPSERT (UPDATE if the row exists, INSERT if not). This functionality is similar to a Hashtable.put in Java. This is used when merging data.

Data Query Language

SELECT

Subqueries

This is a very popular feature.

Support it, and if possible not only inside SELECT statments, but also in UPDATES and DELETES. Wherever possible.

Expressions and Formulas

The ANSI standard does define a few operations, but there are big differences in what vendors support. In Oracle, the comparison operators are not allowed everywhere, for example.

Support what Java supports; maybe except some things like ? : because the ? character would collide with the ? in JDBC, for PreparedStatements.
LIKE in SQL is very popular. If possible continue to support this feature. However, the escape charater can be fixed to \, as the % and _ operators are also fixed.

String constants

String literals are enclosed with single quoted. There is no standard to escape special characters (newline, tab). Two single quotes are used to write a single quote inside a string constant.

Use Java rules for Strings
Allow single quotes as well as double quotes. This simplifies to use NewSQL statments in Java (using JDBC).

User defined functions

Just support calling a procedural language

Built-in Functions

There are some built-in functions defined, however every database vendor supports its own set.

Define a common set of functions, to avoid having to add external user defined functions for simple cases.

Access rights

Support a roles based access right system. However, login should be possible with user name and password only; if a user is part of multiple roles, then he should not have to specify with what role he logs wants to log in.
Support direct manipulation of the system tables, because like this there is no need for a special syntax.

Syntax rules

SQL has many hundred syntax rules.

We want to have as little syntax rules as possible. Still our language should be feature-rich, otherwise it has to be extended to be usefull.
Avoid different levels of compatibility as SQL 92 Entry Level, Intermediate Level, Full Level, because this leads to incompabilities. Extensions should be made using ways that do not change the grammar, for example using libraries. Database vendors should compete on stability, performance, ease of use, functionality, but not grammar.

Namespaces

SQL does not support namespaces, however JDBC defines 'catalogs' and 'schemas'.

The namespace of files in a filesystem is hierarchical, the namespace of Java classes is also. It looks like we should do the same. However, we want to be able to convert our language to SQL, so we have to be carefull.

NULL Handling

NULL handling in SQL is 'problematic': NULL is not equal to NULL. NULL values in indices is a separete story.

Support NULL like in Java. This change will not be easy (for the converters), define some compiler warnings if the user wrote something that may not lead to the expected result.

Joins

SQL supports inner joins, outer joins, and full outer joins.

Don't support inner joins with a list tables (SELECT... FROM TEST T1, TEST T2). Use the JOIN syntax instead (...FROM TEST T1 JOIN TEST T2...)
Don't support FULL OUTER JOIN

Union

UNION ALL and UNION DISTINCT are SQL standard, where UNION DISTINCT is the default

Support UNION ALL only. DISTINCT can be supported separately

Advanced transformations

Some databases support UNION EXCEPT, CROSS JOINS and other seldom used functionality. All this is not standardized.

Support user-defined transformations. Such transformations will be programmed using a prodecural language if the database does not natively support it.

Distinct

DISTINCT is supported in combination with SELECT and UNION

Make DISTINCT its own transformation.

Mananging databases

SQL does not define a standard way to manage databases.

Provide standard ways how to manipulate (create, backup, restore, copy, delete) entire databases or parts of it. Many databases do not allow such operations using SQL statements, but that should not be a reason not to support it.

Logging

Some databases define a way how the transaction log (or other logs) can be read. This is however complety vendor specific.

While it is not possible to dictate the format how the log has to be written to disk, it is possible to define a way to read the content of the transaction log.

Procedural language

SQL does not define procedural elements, however the big vendors support their own extensions.

Define a standard to call a procedural language (such as Java).
Just support Java.
No not add elements of a procedural language to NewSQL, as at some point calling an external function from the database will be required anyway.

Stored procedures

This is not a standard, but many vendors support it.

This is not required if user defined functions are supported.

Triggers

Triggers are called when a row is inserted, updated or deleted. Triggers are largely vendor specific.

Define a standard for triggers.
Just support calling a procedural language

Transactional functionality

COMMIT and ROLLBACK are standard. Some vendors support more functionality, such as nested transactions. There are also many transaction isolation levels possible. Some databases use locking, some versioning, for transaction isolation.

Support switching the autocommit mode on/off.
Support some kind of two-phase commit, but keep it as simple as possible.

Connection management

There is no SQL standard to connecting to a different database / connecting as a different user

Support this.

Metadata

There is no SQL standard, however the JDBC API already supports DatabaseMetaData

Support a common set of system tables (views) to read the meta data in a database independant way from within the database.
If the meta data can be modified from the user, manipulating / moving data would get simpler.

Error messages / codes

Error codes are largely vendor specific. Some applications have to check for error codes (for example, to re-try after a deadlock).

Define standard codes for all common errors wherever possible..