Thread: General Ledger db design
I hope this isn't too far off topic. I've noticed some discussion about referential integrity, the use of nulls, and database design recently here. This is a real world situation where referential integrity needs to be broken in theory, I believe. Does anybody have any experience or knowledge of building financial accounting databases? Am I wrong about this?
The problem is that with "double entry accounting" you have records in tables that both reference other records in the same table as well as different records in other tables depending on some fairly complex logic.
For instance an invoice is a financial instrument, so the "parent record" would naturally want to be part of a company wide "journal" or "ledger." However, its child records would be actual invoice lines as well as two different sets of entries in the general ledger detail, all 3 sets of records must agree with each other on the invoice parent record total.
Double entry accounting I think in theory dictates that you break referencial integrity because you have 2, not 1, sets of records on which a parent record's total must be based. You also have a natural 3rd set of detail records, for instance invoice lines, that don't necessarily have a relationship with the general ledger detail.
The other way to do it is to have a pure journal/ledger relationship where you have a 3rd invoice header table that relates back to the journal. So every time you create/update/delete an invoice, you trigger an analogous journal entry. Not exactly normalized, but I guess that's why you have triggers.
Am I wrong? Again, I apologize if off topic, but I think this is a real world and complex example of some of the discussions here. You could use other database models besides relational, but there is nothing as powerful and as versatile as sql in my opinion. I think it's worth the problems.
Thanks.
No need to miss a message. Get email on-the-go
with Yahoo! Mail for Mobile. Get started.
The problem is that with "double entry accounting" you have records in tables that both reference other records in the same table as well as different records in other tables depending on some fairly complex logic.
For instance an invoice is a financial instrument, so the "parent record" would naturally want to be part of a company wide "journal" or "ledger." However, its child records would be actual invoice lines as well as two different sets of entries in the general ledger detail, all 3 sets of records must agree with each other on the invoice parent record total.
Double entry accounting I think in theory dictates that you break referencial integrity because you have 2, not 1, sets of records on which a parent record's total must be based. You also have a natural 3rd set of detail records, for instance invoice lines, that don't necessarily have a relationship with the general ledger detail.
The other way to do it is to have a pure journal/ledger relationship where you have a 3rd invoice header table that relates back to the journal. So every time you create/update/delete an invoice, you trigger an analogous journal entry. Not exactly normalized, but I guess that's why you have triggers.
Am I wrong? Again, I apologize if off topic, but I think this is a real world and complex example of some of the discussions here. You could use other database models besides relational, but there is nothing as powerful and as versatile as sql in my opinion. I think it's worth the problems.
Thanks.
No need to miss a message. Get email on-the-go
with Yahoo! Mail for Mobile. Get started.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 02/23/07 20:35, Martin Winsler wrote: > I hope this isn't too far off topic. I've noticed some > discussion about referential integrity, the use of nulls, and > database design recently here. This is a real world situation > where referential integrity needs to be broken in theory, I > believe. Does anybody have any experience or knowledge of > building financial accounting databases? Am I wrong about this? > > The problem is that with "double entry accounting" you have > records in tables that both reference other records in the same > table as well as different records in other tables depending on > some fairly complex logic. > > For instance an invoice is a financial instrument, so the "parent > record" would naturally want to be part of a company wide > "journal" or "ledger." However, its child records would be > actual invoice lines as well as two different sets of entries in > the general ledger detail, all 3 sets of records must agree with > each other on the invoice parent record total. > > Double entry accounting I think in theory dictates that you break > referencial integrity because you have 2, not 1, sets of records > on which a parent record's total must be based. You also have a > natural 3rd set of detail records, for instance invoice lines, > that don't necessarily have a relationship with the general > ledger detail. Setting the constraints to be NOT DEFERRABLE and INITIALLY DEFERRED is how you need to set such constraints. > The other way to do it is to have a pure journal/ledger > relationship where you have a 3rd invoice header table that > relates back to the journal. So every time you > create/update/delete an invoice, you trigger an analogous journal > entry. Not exactly normalized, but I guess that's why you have > triggers. > > Am I wrong? Again, I apologize if off topic, but I think this is > a real world and complex example of some of the discussions here. > You could use other database models besides relational, but there > is nothing as powerful and as versatile as sql in my opinion. I > think it's worth the problems. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFF36k7S9HxQb37XmcRAraxAKDUHrMFEBlHU+l12UiWBTEsnoUyogCg3etF PGB6AjUZxOrpKR2E3G8Zir0= =aUd/ -----END PGP SIGNATURE-----
Martin Winsler wrote: > I hope this isn't too far off topic. I've noticed some discussion > about referential integrity, the use of nulls, and database design > recently here. This is a real world situation where referential > integrity needs to be broken in theory, I believe. Does anybody have > any experience or knowledge of building financial accounting > databases? Am I wrong about this? > > The problem is that with "double entry accounting" you have records in > tables that both reference other records in the same table as well as > different records in other tables depending on some fairly complex > logic. > > For instance an invoice is a financial instrument, so the "parent > record" would naturally want to be part of a company wide "journal" or > "ledger." However, its child records would be actual invoice lines as > well as two different sets of entries in the general ledger detail, > all 3 sets of records must agree with each other on the invoice parent > record total. I can only tell you what I've done in the past, you can take it from there. First, there are two very basic tables, the chart of accounts and the list of transactions. Every line in the transaction table links to one account. So far so good. Now we have AR invoices, and AP vouchers. Let's oversimplify for argument and say that when you post an AR invoice you post two entries in the transactions table, a debit to AR and a credit to sales (forget about tax and stuff for now). Likewise for an AP voucher, debit some expense account and credit AP. So now the problem is we've got at least two more potential parent tables for the transaction, being invoices and vouchers, and it gets worse when we add checks received, checks paid, overpayments, allowances and so on and so on. The solution I've always used is to introduce a table of batches. This is the table that unifies all of the others. When you post an invoice, you generate a new batch, give it type "AR". The invoice is stamped with the batch #, as are the GL transaction rows. When you post an AP voucher, do the same thing. Same for checks received, checks paid, etc, all of them have different batch types. In short, the problem of too many parents is inverted to produce many children instead, and the problem goes away. -- Kenneth Downs Secure Data Software, Inc. www.secdat.com / www.andromeda-project.org Office: 631-689-7200 Cell: 631-379-0010 ::Think you may have a problem with programming? Ask yourself this ::question: do you worry about how to throw away a garbage can?
On 02/25/2007 06:21:45 PM, Kenneth Downs wrote: > Martin Winsler wrote: >> This is a real world situation where referential >> integrity needs to be broken in theory, I believe. Does anybody >> have any experience or knowledge of building financial accounting >> databases? Am I wrong about this? >> >> The problem is that with "double entry accounting" you have records >> in tables that both reference other records in the same table as >> well as different records in other tables depending on some fairly >> complex logic. >> For instance an invoice is a financial instrument, so the "parent >> record" would naturally want to be part of a company wide "journal" >> or "ledger." However, its child records would be actual invoice >> lines as well as two different sets of entries in the general ledger >> detail, all 3 sets of records must agree with each other on the >> invoice parent record total. > The solution I've always used is to introduce a table of batches. > This is the table that unifies all of the others. When you post an > invoice, you generate a new batch, give it type "AR". The invoice is > stamped with the batch #, as are the GL transaction rows. When you > post an AP voucher, do the same thing. Same for checks received, > checks paid, etc, all of them have different batch types. It's been a while since I've done finance apps but this is my recollection of the situation. The above proposal takes care of the data structure/referential integrity issues, but does not solve the data integrity issues. The only way, at present, to solve the data integrity issues is to write a FOR EACH STATEMENT trigger to be sure that all the rows agree with each other and everything balances. But this can only be done after all the data goes into the database. For instance, insert the credit and debit rows into a temporary table, then insert from the temporary table into the actual GL transaction table in one go, and have a AFTER ... FOR EACH STATEMENT go through and make sure the entire ledger is still in balance. From a performance standpoint this bites. Of course you can insert the financial transaction rows before inserting a row in the table of batches, or whatever the parent table is. Then write triggers on the batch table to make sure everything stays in balance. Your business rules are enforced, for every batch that exists, but your referential integrity is lost and you can wind up with dangling child rows. All the same I sorta prefer this solution because it seems to me that the mess is easier to clean up. The traditional solution has always been to make sure all your applications have no bugs. They need to do everything in transactions and always insert both the credit and debit sides of every financial transaction, otherwise the ledger (or whatever) can get out of balance because one side or another of the financial transaction is missing. (I don't know why the traditional solution is so popular. Maybe because it was always done this way before ACID compliant databases. Or, could be because it puts the onus for cleaning up the mess on the accountants, and they're used to it because it's the same sort of mess they've always had to clean up. Or it could be because application programmers hate it when the db gives them errors and figure they do a good enough job that it's not a problem. I've also heard people complain about triggers because they don't manage their code base and don't know what triggers exist after a while.) You pretty much have the choice of either enforcing business rules or enforcing referential integrity, but not both. At least that was the conclusion I recalling coming to back when I was doing finance stuff. FWIW, I have long lusted after a per-row trigger that would fire on transaction commit to solve these problems. (Or any sort of trigger with access to the row data so that it can be checked.) I couldn't say whether such triggers are technically feasible, but I'm pretty sure nobody's interested enough to do the implementation. Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 02/26/07 01:39, Karl O. Pinc wrote: > > On 02/25/2007 06:21:45 PM, Kenneth Downs wrote: >> Martin Winsler wrote: > [snip] > The above proposal takes care of the data > structure/referential integrity > issues, but does not solve the data integrity issues. > > The only way, at present, to solve the data integrity > issues is to write a FOR EACH STATEMENT trigger to be sure that > all the rows agree with each other and everything balances. > But this can only be done after all the data goes into the database. [snip] > FWIW, I have long lusted after a per-row trigger that would > fire on transaction commit to solve these problems. > (Or any sort of trigger with access to the row > data so that it can be checked.) > I couldn't say whether such triggers are technically feasible, > but I'm pretty sure nobody's > interested enough to do the implementation. Why wouldn't deferred (commit time) constraints be adequate to the task? That's why they were designed. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFF4p5LS9HxQb37XmcRAqRlAJ4x1I/R2C/OHc+qLwZpz81jJRcRewCeJDz5 /TZzI8PkALsb/YSIl7wyl+4= =OTZZ -----END PGP SIGNATURE-----
Kenneth Downs wrote: > So far so good. Now we have AR invoices, and AP vouchers. Let's My apologies if this is a stupid question, but what do "AR" and "AP" stand for? I'm following this thread with interest. Ray. --------------------------------------------------------------- Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland rod@iol.ie ---------------------------------------------------------------
On Feb 26, 2007, at 19:32 , Raymond O'Donnell wrote: > Kenneth Downs wrote: > >> So far so good. Now we have AR invoices, and AP vouchers. Let's > > My apologies if this is a stupid question, but what do "AR" and > "AP" stand for? I'm following this thread with interest. Accounts Receivable and Accounts Payable. Michael Glaesemann grzm seespotcode net
Karl O. Pinc wrote: > > On 02/25/2007 06:21:45 PM, Kenneth Downs wrote: >> Martin Winsler wrote: > >>> This is a real world situation where referential >>> integrity needs to be broken in theory, I believe. Does anybody >>> have any experience or knowledge of building financial accounting >>> databases? Am I wrong about this? >>> >>> The problem is that with "double entry accounting" you have records >>> in tables that both reference other records in the same table as >>> well as different records in other tables depending on some fairly >>> complex logic. >>> For instance an invoice is a financial instrument, so the "parent >>> record" would naturally want to be part of a company wide "journal" >>> or "ledger." However, its child records would be actual invoice >>> lines as well as two different sets of entries in the general ledger >>> detail, all 3 sets of records must agree with each other on the >>> invoice parent record total. > >> The solution I've always used is to introduce a table of batches. >> This is the table that unifies all of the others. When you post an >> invoice, you generate a new batch, give it type "AR". The invoice is >> stamped with the batch #, as are the GL transaction rows. When you >> post an AP voucher, do the same thing. Same for checks received, >> checks paid, etc, all of them have different batch types. > > It's been a while since I've done finance apps but > this is my recollection of the situation. > > The above proposal takes care of the data > structure/referential integrity > issues, but does not solve the data integrity issues. > > The only way, at present, to solve the data integrity > issues is to write a FOR EACH STATEMENT trigger to be sure that > all the rows agree with each other and everything balances. > But this can only be done after all the data goes into the database. > For instance, insert the credit and debit rows > into a temporary table, then insert from the temporary > table into the actual GL transaction table in one go, > and have a AFTER ... FOR EACH STATEMENT go through > and make sure the entire ledger is still in balance. > From a performance standpoint this bites. Yeah, there is going to be some kind of extra work here. My own solution is to add a "closed flag" to the batch and a calculated column on the GL entries. If the closed flag is "N", the calculated column is zero, so that the ledger remains in balance while the entries are going in one-by-one. A trigger on the batch table traps the setting of closed="Y" and sets the calculated values to the trx values, so the entire batch is committed inside of a single transaction. If the batch is not balanced, it will reject a setting of closed="Y". Other trigger code prevents new entries to a closed batch or the re-opening of a batch. -- Kenneth Downs Secure Data Software, Inc. www.secdat.com / www.andromeda-project.org Office: 631-689-7200 Cell: 631-379-0010 ::Think you may have a problem with programming? Ask yourself this ::question: do you worry about how to throw away a garbage can?
[snip] >>> Martin Winsler wrote: >>>> Does anybody have any experience or knowledge of building >>>> financial accounting databases? [snip] I too was thinking about building a double entry accounting system and I've been following this thread closely, learning a few tricks on the way :) I've been gathering up ideas on how a db schema might look by reading the gnuCash tutorials on double-entry accounting, but does anybody know if there are open source applications that have already solved this on the db layer? filipe
On 02/26/2007 07:40:17 AM, Kenneth Downs wrote: > Karl O. Pinc wrote: >> >> On 02/25/2007 06:21:45 PM, Kenneth Downs wrote: >>> Martin Winsler wrote: >> >>>> This is a real world situation where referential >>>> integrity needs to be broken in theory, I believe. >>>> The problem is that with "double entry accounting" you have >>>> records in tables that both reference other records in the same >>>> table as well as different records in other tables depending on >>>> some fairly complex logic. >>>> For instance an invoice is a financial instrument, so the "parent >>>> record" would naturally want to be part of a company wide >>>> "journal" or "ledger." However, its child records would be actual >>>> invoice lines as well as two different sets of entries in the >>>> general ledger detail, all 3 sets of records must agree with each >>>> other on the invoice parent record total. >> >>> The solution I've always used is to introduce a table of batches. >>> This is the table that unifies all of the others. When you post an >>> invoice, you generate a new batch, give it type "AR". The invoice >>> is stamped with the batch #, as are the GL transaction rows. When >>> you post an AP voucher, do the same thing. Same for checks >>> received, checks paid, etc, all of them have different batch types. > > My own solution is to add a "closed flag" to the batch and a > calculated column on the GL entries. If the closed flag is "N", the > calculated column is zero, so that the ledger remains in balance > while the entries are going in one-by-one. > > A trigger on the batch table traps the setting of closed="Y" and sets > the calculated values to the trx values, so the entire batch is > committed inside of a single transaction. If the batch is not > balanced, it will reject a setting of closed="Y". > > Other trigger code prevents new entries to a closed batch or the > re-opening of a batch. I haven't entirely focused my brain around this problem, but it seems to me that you still get rows in, e.g., the financial transaction detail table -- the credits and debits -- that don't balance out for some period of time. Forever if the application has a bug. Why is your approach better than just putting the "check that everything balances" code, plus whatever updating you want to do elsewhere in the db, directly into the parent table's trigger (the batch table) and having the application insert into the batch table last? You can put triggers into the financial transaction detail table that says that the batch id has to be valid if it exists to get your referential integrity right. You can also not allow new rows to be inserted if there is already a batch row, thus the insertion of a batch row "closes" the batch. Trying to add new credits or debits or change the values of existing credits or debits (prevented in the cr/db table's update trigger), things would cause the batch to go out of balance, are thus prevented. If something goes wrong, you've got some extra rows laying about and you can easily identify them because there's no corresponding row in in the batches table. (Your proposal has good error recovery too, but seems like it's more work to impliment, as far as having to go back and update the "closed" flag, and even more instruction needs to be given to the application programmer come time to use the db.) Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
Karl O. Pinc wrote: > > You can put triggers into the financial transaction detail table > that says that the batch id has to be valid if it exists > to get your referential integrity right. Right. > You can also not allow new rows to be inserted if there > is already a batch row, thus the insertion of a > batch row "closes" the batch. Not sure what you mean, but you can in fact have any number of open batches, on the assumption that it is a multi-user system. > Trying to add new credits > or debits or change the values of existing credits or > debits (prevented in the cr/db table's update trigger), things > would cause the batch to go out of balance, are thus > prevented. Using the batch # as the foreign key allows all batches to be isolated from each other. > > If something > goes wrong, you've got some extra rows laying about and > you can easily identify them because there's no corresponding > row in in the batches table. (Your proposal has good error > recovery too, but seems like it's more work to impliment, > as far as having to go back and update the "closed" flag, > and even more instruction needs to be given to the > application programmer come time to use the db.) Well to be honest I don't manually code any of it, I have a generator that does it, I don't trust myself to code something like that properly :) The code generator lets me do necessary things like sum the transactions to the batch row, preventing a close unless they balance, preventing an update to the batch row when it is already closed, which as a bonus prevents new rows being added, and "Distributing" (as we call it) the close flag to the transaction rows when the batch closes. > > Karl <kop@meme.com> > Free Software: "You don't pay back, you pay forward." > -- Robert A. Heinlein > -- Kenneth Downs Secure Data Software, Inc. www.secdat.com / www.andromeda-project.org Office: 631-689-7200 Cell: 631-379-0010 ::Think you may have a problem with programming? Ask yourself this ::question: do you worry about how to throw away a garbage can?
On 02/26/2007 11:41:18 AM, Kenneth Downs wrote: >> You can also not allow new rows to be inserted if there >> is already a batch row, thus the insertion of a >> batch row "closes" the batch. > > Not sure what you mean, but you can in fact have any number of open > batches, on the assumption that it is a multi-user system. I'm saying why have a close flag at all? Why not signal the close of the batch by the creation of the batch row? Karl <kop@meme.com> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 02/26/07 10:35, Filipe Fernandes wrote: > [snip] >>>> Martin Winsler wrote: >>>>> Does anybody have any experience or knowledge of building >>>>> financial accounting databases? > [snip] > > I too was thinking about building a double entry accounting system and > I've been following this thread closely, learning a few tricks on the way :) > > I've been gathering up ideas on how a db schema might look by reading the > gnuCash tutorials on double-entry accounting, but does anybody know if > there are open source applications that have already solved this on the db > layer? Ledger SMB (http://www.ledgersmb.org/about/) might be a place to start. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFF4ySeS9HxQb37XmcRAm5rAJ0R8/GVHF/Zdycyca63aJATp4urdQCfW8+n 6Z3D1feasFASpUQS1To91Hc= =2s3j -----END PGP SIGNATURE-----
> Look at SQL-Ledger and LedgerSMB [snip] > Ledger SMB (http://www.ledgersmb.org/about/) might be a place to start. Thanks Ron and Brent for the suggestion... very much appreciated. filipe
Filipe Fernandes wrote: > [snip] > >>>> Martin Winsler wrote: >>>> >>>>> Does anybody have any experience or knowledge of building >>>>> financial accounting databases? >>>>> > [snip] > > I too was thinking about building a double entry accounting system and > I've been following this thread closely, learning a few tricks on the way :) > > I've been gathering up ideas on how a db schema might look by reading the > gnuCash tutorials on double-entry accounting, but does anybody know if > there are open source applications that have already solved this on the db > layer? > Look at SQL-Ledger and LedgerSMB Brent Wood
On 2/26/07, Filipe Fernandes <fernandes.fd@gmail.com> wrote: > [snip] > >>> Martin Winsler wrote: > >>>> Does anybody have any experience or knowledge of building > >>>> financial accounting databases? > [snip] > > I too was thinking about building a double entry accounting system and > I've been following this thread closely, learning a few tricks on the way :) > > I've been gathering up ideas on how a db schema might look by reading the > gnuCash tutorials on double-entry accounting, but does anybody know if > there are open source applications that have already solved this on the db > layer? start here: http://www.ledgersmb.org/ merlin