qapi: Tighten the regex on valid names

We already documented that qapi names should match specific
patterns (such as starting with a letter unless it was an enum
value or a downstream extension).  Tighten that from a suggestion
into a hard requirement, which frees up names beginning with a
single underscore for qapi internal usage.

The tighter regex doesn't forbid everything insane that a user
could provide (for example, a user could name a type 'Foo-lookup'
to collide with the generated 'Foo_lookup[]' for an enum 'Foo'),
but does a good job at protecting the most obvious uses, and
also happens to reserve single leading underscore for later use.

The handling of enum values starting with a digit is tricky:
commit 9fb081e introduced a subtle bug by using c_name() on
a munged value, which would allow an enum to include the
member 'q-int' in spite of our reservation.  Furthermore,
munging with a leading '_' would fail our tighter regex.  So
fix it by only munging for leading digits (which are never
ticklish in c_name()) and by using a different prefix (I
picked 'D', although any letter should do).

Add new tests, reserved-member-underscore and reserved-enum-q,
to demonstrate the tighter checking.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1447836791-369-22-git-send-email-eblake@redhat.com>
Message-Id: <1447883135-18020-1-git-send-email-eblake@redhat.com>
[Eric's fixup squashed in]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
diff --git a/scripts/qapi.py b/scripts/qapi.py
index d2ce9b3..3e5caa8 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -353,9 +353,11 @@
     return find_enum(discriminator_type)
 
 
-# FIXME should enforce "other than downstream extensions [...], all
-# names should begin with a letter".
-valid_name = re.compile('^[a-zA-Z_][a-zA-Z0-9_.-]*$')
+# Names must be letters, numbers, -, and _.  They must start with letter,
+# except for downstream extensions which must start with __RFQDN_.
+# Dots are only valid in the downstream extension prefix.
+valid_name = re.compile('^(__[a-zA-Z0-9.-]+_)?'
+                        '[a-zA-Z][a-zA-Z0-9_-]*$')
 
 
 def check_name(expr_info, source, name, allow_optional=False,
@@ -374,8 +376,8 @@
                                 % (source, name))
     # Enum members can start with a digit, because the generated C
     # code always prefixes it with the enum name
-    if enum_member:
-        membername = '_' + membername
+    if enum_member and membername[0].isdigit():
+        membername = 'D' + membername
     # Reserve the entire 'q_' namespace for c_name()
     if not valid_name.match(membername) or \
        c_name(membername, False).startswith('q_'):