|
|
---------------------------------------------------------
 SQLITE DRIVER IDEAS, ISSUES, PROPOSALS
 Copyright (C) 2003 Jaroslaw Staniek js at iidea dot pl
 Started: 2003-07-09
 Kexi home page: http://www.koffice.org/kexi/
---------------------------------------------------------
|
|
|
|
|
|
|
|
|
1. In most situations (especially on massive data operations) we do not want to get the types of the columns,
so:

PRAGMA show_datatypes = OFF;
|
|
|
|
|
|
|
|
|
2. SQLite automatically adds a primary key (the hidden 'oid' column) to the table if there is no such key.
Such a pkey column is not visible to statements like 'select * from table';
'select oid,* from table' needs to be executed to also get this special column.

See section '3.1 The ROWID of the most recent insert' of the c_interface.html file.
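
The difference between the two statements can be checked with a minimal sketch. It uses Python's standard sqlite3 module rather than the C interface the driver itself uses, and the helper name rows_with_oid is made up for this illustration; the 'persons' table is borrowed from the example in point 7.

```python
import sqlite3

def rows_with_oid(conn, table):
    """Return all rows of `table` with the hidden oid as the first column."""
    # Table names cannot be bound as parameters, so quote the identifier by hand.
    quoted = '"' + table.replace('"', '""') + '"'
    return conn.execute(f"SELECT oid, * FROM {quoted} ORDER BY oid").fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (name varchar, city integer)")
conn.executemany("INSERT INTO persons VALUES (?, ?)",
                 [("Jarek", 1), ("Jakub", 1)])

plain = conn.execute("SELECT * FROM persons").fetchall()
with_oid = rows_with_oid(conn, "persons")

print(plain)     # two columns per row: name, city
print(with_oid)  # three columns per row: oid, name, city
```

A driver that needs to address rows for later UPDATE or DELETE would presumably always select the oid and hide it from the user.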
|
|
|
|
|
|
|
|
|
3. For smaller tables (how small? -- add a configuration option for this) the 'in memory' function
sqlite_get_table() could be used to speed up row retrieval.
|
|
|
|
|
|
|
|
|
4. Queries entered by the user in the Query Designer should be checked for syntactic and logical validity and transformed to an SQLite-compatible form before execution. It makes no sense to ask the SQLite engine whether a given SQL statement is valid, because then we could not show a sufficiently detailed error message to the user.
|
|
|
|
|
|
|
|
|
5. SQLite not only doesn't handle column types but also doesn't check value sizes, e.g. it is possible to insert a string of length 100 into a column of size 20.
These checks should be made in the KexiDB SQLite engine driver. In fact these checks could be made for each driver, because the user wants to get a descriptive, localized, friendly message about what is wrong. No single engine provides this, of course. We need to store parameters such as field size in project meta-data, as SQLite doesn't store them in any convenient way. It stores only the 'CREATE TABLE' statement, as is.
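
A minimal sketch of both halves of this point, written with Python's standard sqlite3 module: SQLite accepts the oversized string, and a driver-side check catches it. FIELD_SIZES and check_size are hypothetical stand-ins for the project meta-data and the KexiDB check; they are not existing KexiDB names, and the message is not localized here.

```python
import sqlite3

# Hypothetical project meta-data: maximum length per (table, column).
FIELD_SIZES = {("persons", "name"): 20}

def check_size(table, column, value):
    """Return None if the value fits, otherwise a user-readable message."""
    limit = FIELD_SIZES.get((table, column))
    if limit is not None and isinstance(value, str) and len(value) > limit:
        return (f"The value for {table}.{column} has {len(value)} characters; "
                f"at most {limit} are allowed.")
    return None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (name varchar(20))")

long_name = "x" * 100
# SQLite itself does not enforce the declared size of 20.
conn.execute("INSERT INTO persons VALUES (?)", (long_name,))
stored = conn.execute("SELECT length(name) FROM persons").fetchone()[0]

print(stored)                                     # length actually stored
print(check_size("persons", "name", long_name))  # message from the driver-side check
```

In a real driver the check would run before the INSERT is sent to the engine, so the oversized value is never stored.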
|
|
|
|
|
|
|
|
|
6. Possible storage methods for an SQLite database embedded in a Kexi project:
A. Single SQLite-compatible database file (let's name it: .sqlite file)
- Advantages: Best deal for bigger databases - no need to rewrite data from the SQLite file to another one,
  fastest open and save times. DB data consumes disk space only once. Other applications that use the SQLite library could also make use of the standard format of the .sqlite file's contents. Kexi project and data would be easily, de facto, separated, which is considered good practice in DB programming.
- Disadvantages: The user (who may want to transfer a database) needs to know that the .kexi file doesn't store his data and that the .sqlite file does.
|
|
|
|
|
|
B. Single SQLite-compatible database file embedded inside the Kexi project .kexi file.
SQLite requires access to a file in its own (raw) format to be available somewhere in the path. If the SQLite storage layer could be patched to add an option for seeking to a given file position, SQLite data could be stored after the Kexi project data. But with the SQLite raw data saved after the Kexi project's data, the file would have to be rewritten whenever the project contents change (and this happens quite frequently). So, finally, storing both files concatenated during normal operations is risky, costly and difficult to implement cleanly.
- Advantages: The user does not need to know that SQLite is used in Kexi as the embedded DB engine (or even that there is any SQL engine). Transferring just one file between machines means successfully transferring both data and project.
- Disadvantages: lack of everything described as advantages of method A: difficult and costly open and save operations (unless the SQLite storage layer could be patched).
|
|
|
|
|
|
Extensions and combinations of both of the above methods:
- .sqlite files compress really well, so a compress option can be added (if not for regular saving, then at least for the "Email project & data" or 'Save As' actions). For these actions, concatenating the SQLite data with the Kexi project's data would be another option, convenient from the user's point of view.
|
|
|
|
|
|
CURRENT IMPLEMENTATION: Method B is selected, with the above extensions added to the TODO list.
|
|
|
|
|
|
|
|
|
7. SQLite built-in views are read-only, so the proposal is not to use them. Here is why:
We want to have rw queries in Kexi if the main table in a query is rw.
<DEFINITION>: The main table T in a query Q is a table that is not on the 'many' side of the query's relations.
</DEFINITION>
|
|
|
<Example>:
table persons (name varchar, city integer);
table cities (id integer primary key, name varchar);

DATA: [Jarek, 1]-------[1, Warsaw]
                     /
      [Jakub, 1]-----/

query: select * from persons, cities
Now: the 'cities' table is the main table (in other words it is the MASTER table in this query).
The 'cities' table is a rw table in this query, while the 'persons' table is read-only because it is on the 'many' side
of the persons-cities relation. When the cities.id field is modified, the appropriate persons.city values in related
records will be updated if cascade update is enabled.
</Example>
|
|
|
IDEAS:
A) A query result view (table view, forms, etc.) should allow editing fields from
the main (master) table of the query, so every field object KexiDB::Field should have a method
bool KexiDB::Field::isWritable() to let the GUI check whether editing is allowed. Note that a given field object
should be allocated for a given query independently of the same field allocated for the table schema.
The first field object can disallow editing while the latter can allow editing (because it is
a component of a regular table).
B) Also add a method to KexiDB::Field, returning QString, that gives an i18n'd message about the reasons
for disallowing editing of a given field in the context of a given query.
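
Ideas A) and B) can be sketched as a per-query field object. This is only an illustration in Python, not the KexiDB C++ API: the class QueryField and its methods is_writable and read_only_reason are hypothetical names, and the message is a plain string rather than an i18n'd QString.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class QueryField:
    """A field as seen through one particular query (hypothetical type).

    It is allocated per query, independently of the field object that
    belongs to the table schema.
    """
    table: str         # table the field comes from
    name: str          # field name
    master_table: str  # main (master) table of the query

    def is_writable(self) -> bool:
        # Idea A: only fields of the query's main table are editable.
        return self.table == self.master_table

    def read_only_reason(self) -> Optional[str]:
        # Idea B: explain why editing is disallowed, or None if it is allowed.
        if self.is_writable():
            return None
        return (f"Field '{self.table}.{self.name}' cannot be edited in this query "
                f"because '{self.table}' is not its main table "
                f"('{self.master_table}' is).")

# Fields of the query "select * from persons, cities", with 'cities' as master.
fields = [
    QueryField("persons", "name", "cities"),
    QueryField("persons", "city", "cities"),
    QueryField("cities", "id", "cities"),
    QueryField("cities", "name", "cities"),
]

for f in fields:
    print(f.table, f.name, f.is_writable(), f.read_only_reason())
```

Keeping the master table on the per-query object means the same column can be read-only in one query and writable in another, or in its own table view.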
|
|
|
|
|
|
|
|
|
----------------------------------------------------------------
|
|
|
8. ERRORS FOUND
8.1 select * from (select name from persons limit 1) limit 2
    - should return 1 row; returns 2
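
Whether 8.1 still applies depends on the SQLite library in use; the report above dates from the version used when these notes were written. A small check like the following, using Python's standard sqlite3 module and the 'persons' table from point 7, reports the result for whichever library it runs against:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (name varchar, city integer)")
conn.executemany("INSERT INTO persons VALUES (?, ?)",
                 [("Jarek", 1), ("Jakub", 1)])

# The inner LIMIT 1 should cap the result no matter what the outer LIMIT says.
rows = conn.execute(
    "SELECT * FROM (SELECT name FROM persons LIMIT 1) LIMIT 2"
).fetchall()

print("SQLite version:", sqlite3.sqlite_version)
print("rows returned:", len(rows))
```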
|
|
|
|
|
|
----------------------------------------------------------------
|
|
|
|
|
|
HINTS:

PRAGMA table_info(table-name);
For each column in the named table, invoke the callback function
once with information about that column, including the
column name, data type, whether or not the column can be NULL,
and the default value for the column.
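
A minimal sketch of reading this pragma through Python's standard sqlite3 module, where the per-column callback becomes one result row per column. The helper name describe_table and the NOT NULL / DEFAULT additions to the 'cities' table are assumptions made for this illustration. In SQLite 3 each row is (cid, name, type, notnull, dflt_value, pk).

```python
import sqlite3

def describe_table(conn, table):
    """Return one dict per column of `table`, based on PRAGMA table_info."""
    # The pragma does not accept bound parameters, so quote the name by hand.
    quoted = '"' + table.replace('"', '""') + '"'
    columns = []
    for cid, name, col_type, notnull, default, pk in conn.execute(
            f"PRAGMA table_info({quoted})"):
        columns.append({
            "name": name,
            "type": col_type,           # declared type text
            "nullable": not notnull,
            "default": default,         # default value as SQL text, or None
            "primary_key": bool(pk),
        })
    return columns

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities ("
             "id integer primary key, "
             "name varchar NOT NULL DEFAULT 'unknown')")

info = describe_table(conn, "cities")
for column in info:
    print(column)
```

The declared size of a column, if any, only appears as part of the type text, which is why point 5 proposes keeping field sizes in project meta-data.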
|
|
|
|
|
|
|
|
|
---------------------------------------------------------------
|
|
|
OPTIMIZATION:

Re: [sqlite] Questions about sqlite's join translation
From: D. Richard Hipp <drh-X1OJI8nnyKUAvxtiuMwx3w@public.gmane.org>
Reply-To: sqlite-users-CzDROfG0BjIdnm+yROfE0A@public.gmane.org
Date: Saturday, 9 October 2004 02:59:06
Newsgroups: gmane.comp.db.sqlite.general
References: 1
|
|
|
|
|
|
Keith Herold wrote:
> The swiki says that making JOINs into a where clause is more efficient,
> since sqlite translates the join condition into a where clause.

When SQLite sees this:

    SELECT * FROM a JOIN b ON a.x=b.y;

It translates it into the following before compiling it:

    SELECT * FROM a, b WHERE a.x=b.y;

Neither form is more efficient than the other.  Both will generate
identical code.  (There are subtle differences on a LEFT OUTER
JOIN, but those details can be ignored when you are looking at
things at a high level, as we are.)

> It also
> says that you make queries more efficient by minimizing the number of
> rows returned in the FROM clause as far to the left as possible in the
> join.  Does the latter matter if you are translating everything into a
> where clause anyway?
>

SQLite implements joins using nested loops with the outer
loop formed by the first table in the join and the inner loop
formed by the last table in the join.  So for the example
above you would have:

    For each row in a:
      For each row in b such that b.y=a.x:
        Return the row

If you reverse the order of the tables in the FROM clause like
this:

    SELECT * FROM b, a WHERE a.x=b.y;

You should get an equivalent result on output, but SQLite will
implement the query differently.  Specifically it does this:

    For each row in b:
      For each row in a such that a.x=b.y:
        Return the row
|
|
|
|
|
|
The trick is that you want to arrange the order of tables so that
the "such that" clause on the inner loop is able to use an index
to jump right to the appropriate row instead of having to do a
full table scan.  Suppose, for example, that you have an index
on a(x) but not on b(y).  Then if you do this:

    SELECT * FROM a, b WHERE a.x=b.y;

    For each row in a:
      For each row in b such that b.y=a.x:
        Return the row

For each row in a, you have to do a full scan of table b.  So
the time complexity will be O(N^2).  But if you reverse the order
of the tables in the FROM clause, like this:

    SELECT * FROM b, a WHERE b.y=a.x;

    For each row in b:
      For each row in a such that a.x=b.y:
        Return the row

Now the inner loop is able to use an index to jump directly to the
rows in a that it needs and does not need to do a full scan of the
table.  The time complexity drops to O(NlogN).

So the rule should be:  For every table other than the first, make
sure there is a term in the WHERE clause (or the ON or USING clause
if that is your preference) that lets the search jump directly to
the relevant rows in that table based on the results from tables to
the left.
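
The rule can be tried out with EXPLAIN QUERY PLAN. The sketch below uses Python's standard sqlite3 module; the table contents, the extra columns and the index name idx_a_x are made up for the illustration. One assumption matters: SQLite versions newer than this message may reorder the tables of a plain comma join on their own, so the sketch writes CROSS JOIN, which SQLite documents as keeping the tables in the order given, to reproduce the nested-loop order described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER, payload TEXT);
    CREATE TABLE b (y INTEGER, note TEXT);
    CREATE INDEX idx_a_x ON a(x);   -- index on a(x), none on b(y)
""")
conn.executemany("INSERT INTO a VALUES (?, ?)",
                 [(i, f"a{i}") for i in range(100)])
conn.executemany("INSERT INTO b VALUES (?, ?)",
                 [(i % 10, f"b{i}") for i in range(50)])

def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Same join, two table orders; CROSS JOIN keeps the written order.
a_first = "SELECT a.x, b.note FROM a CROSS JOIN b WHERE a.x = b.y"
b_first = "SELECT a.x, b.note FROM b CROSS JOIN a WHERE a.x = b.y"

for label, sql in (("a outer:", a_first), ("b outer:", b_first)):
    print(label, plan(sql))

# Both orders must produce the same rows; only the access path differs.
same_rows = sorted(conn.execute(a_first)) == sorted(conn.execute(b_first))
print("same result set:", same_rows)
```

With b as the outer table, the plan for the inner table a should mention idx_a_x. The exact wording of the plan text differs between SQLite versions, so it is printed here rather than quoted.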
|
|
|
|
|
|
Other database engines with more complex query optimizers will
typically attempt to reorder the tables in the FROM clause in order
to give you the best result.  SQLite is more simple-minded - it
codes whatever you tell it to code.

Before you ask, I'll point out that it makes no difference whether
you say "a.x=b.y" or "b.y=a.x".  They are equivalent.  All of the
following generate the same code:

      ON a.x=b.y
      ON b.y=a.x
      WHERE a.x=b.y
      WHERE b.y=a.x
|
|
|
---------------------------------------------------------------
|
|
|
|