Although there are a vast number of free and non-free SQL
implementations, few of them are really good for unit testing. A
database for unit testing should be very fast on small data sets and
should not present performance or other obstacles to setting up
databases, adding and removing tables, and destroying databases. A
typical unit test run will want to create thousands of databases.
With most SQL databases, this is not achievable, and one must resort
to compromises such as keeping the tables around and only setting up
data on each unit test run. Mayfly aims to make creating an in-memory
SQL database as easy as creating any other in-memory data structure.
The leading alternative is Hypersonic (HSQLDB, http://hsqldb.org/), but
it can still be somewhat slow (benchmarking it is one of our tasks) and
does not allow one to create in-memory databases as freely as one can
create Java objects.

Mayfly is written in Java (it currently works with gcj/libgcj and
Sun Java 1.5 or later, although exact versions may change), and
should run on any OS which provides a Java platform. It relies on the
JDBC interfaces from the Java platform (although one has the choice
of instantiating a Database object directly if one wishes to avoid
the static nature of DriverManager).

Our own unit tests are written with JUnit (http://www.junit.org/),
although Mayfly should be useful for unit tests written with any unit
test framework. Other dependencies are few, just some of the usual
suspects: Jakarta Commons (http://jakarta.apache.org/commons/) and
Joda-Time (http://joda-time.sourceforge.net/).

If we do decide to
provide persistence (the ability to save the database to disk), the
leading candidate would be Prevayler (http://www.prevayler.org/) or a
similar mechanism, although persistence is not very high on the list
of goals for Mayfly.

Mayfly is starting to implement enough of SQL to run real
applications (or their tests). Contributions are still welcome,
of course (we've listed a few things missing below).

Other challenges include figuring out whether the simple
implementations we currently have (for example, table scans rather
than indexes) suffice for the kinds of data sets which unit tests end
up using. And of course, Mayfly needs to be fast (see the
PerformanceTest class for some known slow spots). At the moment
Mayfly is fast for data definition but slow for inserting rows and
the like, which reverses the usual situation with SQL databases.
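To make the table-scan versus index trade-off concrete, here is a minimal sketch of the two strategies in plain Java. It does not use Mayfly's classes: the ScanVersusIndex class, its method names, and the representation of a row as a column-name-to-value map are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: not Mayfly's implementation. Rows are modeled as
// column-name -> value maps, which is an assumption of this sketch.
class ScanVersusIndex {

    // Table scan: examine every row and keep those whose column equals
    // the wanted value. Cost grows linearly with the number of rows.
    static List<Map<String, Object>> scan(
            List<Map<String, Object>> rows, String column, Object wanted) {
        List<Map<String, Object>> matches = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            if (wanted.equals(row.get(column))) {
                matches.add(row);
            }
        }
        return matches;
    }

    // Index: one pass up front groups rows by column value, after which
    // each equality lookup is a single hash probe.
    static Map<Object, List<Map<String, Object>>> buildIndex(
            List<Map<String, Object>> rows, String column) {
        Map<Object, List<Map<String, Object>>> index = new HashMap<>();
        for (Map<String, Object> row : rows) {
            index.computeIfAbsent(row.get(column), k -> new ArrayList<>())
                 .add(row);
        }
        return index;
    }

    // Hypothetical helper that builds a two-column row.
    static Map<String, Object> row(int id, String name) {
        Map<String, Object> r = new HashMap<>();
        r.put("id", id);
        r.put("name", name);
        return r;
    }
}
```

For the small tables a unit test typically creates, the scan may well be fast enough, and it avoids having to keep an index up to date on every insert. Whether that holds for a given test suite is the open question described above.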

A short list of SQL features which we believe are high priority:
* More SQL operators such as LIKE and BETWEEN.

We do not envisage providing network protocols, at least not initially.
It may be possible to write such support on top of Mayfly via JDBC, or
it may become part of Mayfly later on.

Dumping the contents of a database in some kind of textual format
is useful for debugging, comparing two databases in a test,
and the like.
The SqlDumper class dumps a database to SQL. SQL is a good choice
for preserving things like data types and for interchange with
other SQL databases. Other formats,
like XML, YAML (compatible with Rails fixtures), or JSON
might also be interesting for interchange with non-SQL tools.
We don't regard Java serialization, XStream, or the like as
particularly interesting, as these are closely tied to
Mayfly's internal data structures.
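As a rough illustration of why SQL works well as a dump format, the sketch below turns in-memory rows into INSERT statements. It is not the SqlDumper class: the SqlDumpSketch name, the dumpRows and literal methods, and the quoting rules (numbers unquoted, null written as NULL, single quotes doubled inside strings) are assumptions of this example.

```java
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Illustrative only: a hypothetical dumper, not Mayfly's SqlDumper.
class SqlDumpSketch {

    // Render one value as a SQL literal so that its type survives the
    // round trip: numbers stay bare, strings are quoted and escaped.
    static String literal(Object value) {
        if (value == null) {
            return "NULL";
        }
        if (value instanceof Number) {
            return value.toString();
        }
        return "'" + value.toString().replace("'", "''") + "'";
    }

    // Emit one INSERT statement per row, with columns in the given order
    // so that the output is deterministic.
    static String dumpRows(String table, List<String> columns,
                           List<Map<String, Object>> rows) {
        StringBuilder out = new StringBuilder();
        for (Map<String, Object> row : rows) {
            StringJoiner values = new StringJoiner(", ");
            for (String column : columns) {
                values.add(literal(row.get(column)));
            }
            out.append("INSERT INTO ").append(table)
               .append("(").append(String.join(", ", columns)).append(")")
               .append(" VALUES(").append(values.toString()).append(");\n");
        }
        return out.toString();
    }
}
```

With a dump like this, comparing two databases in a test reduces to comparing two strings, provided both dumps emit tables and rows in the same order.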

Some of what has been implemented so far:
* The ability to very cheaply snapshot the database in memory
  (both metadata and data) and restore from that point.
* Any number of tables, columns, and rows (limited by available
  memory and, to a certain extent, by query speed).
* Joins.
* WHERE clauses with =, !=, <, >, AND, OR, and NOT.
* Integer, decimal, string, date/timestamp,
  binary (BLOB), and text (CLOB) data types.
* GROUP BY and HAVING.
* ORDER BY and LIMIT/OFFSET.
* Auto-increment columns.
* NOT NULL, primary key, unique, and foreign key constraints.
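One way a cheap in-memory snapshot can work is to treat each database state as an immutable value, so that taking a snapshot only saves a reference. The sketch below illustrates that idea with plain Java collections. Whether Mayfly works this way internally is not stated here, and the SnapshotSketch class and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: a toy single-table "database" whose state is an
// unmodifiable list that is never changed in place.
class SnapshotSketch {

    private List<String> rows = Collections.emptyList();

    // Each write builds a new list and swaps the reference, leaving any
    // previously handed-out snapshot untouched.
    void insert(String row) {
        List<String> copy = new ArrayList<>(rows);
        copy.add(row);
        rows = Collections.unmodifiableList(copy);
    }

    // Snapshot: constant time, since it only returns the current reference.
    List<String> snapshot() {
        return rows;
    }

    // Restore: constant time, since it only swaps the reference back.
    // This sketch trusts the caller to pass a list obtained from snapshot().
    void restore(List<String> snapshot) {
        rows = snapshot;
    }

    int rowCount() {
        return rows.size();
    }
}
```

A test could build its schema and fixture data once, take a snapshot, and restore it before each test case. The price in this naive version is that every insert copies the list; a structure-sharing collection would reduce that cost.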