Update readme. Refactor toldg a little. Provide example files for testing by the user.

Felix Martin 2020-08-11 16:30:30 -04:00
parent 3138be8d17
commit 367c38592b
13 changed files with 337 additions and 98 deletions

5
.gitignore vendored

@ -1,5 +1,6 @@
# Ignore sensitive data
gather.json
# Ignore example output file
example/processed
example/mappings/unmatched.csv
# ---> Python
# Byte-compiled / optimized / DLL files
__pycache__/


@ -1,28 +1,87 @@
# ledgerpy
Scripts to transform different input formats (CSV and OFX) into ledger
accounting files. Includes mapping language to update transaction details
automatically.
accounting files. The scripts allow manipulating ledger transactions based on
CSV mapping files.
There are other [scripts](https://github.com/ledger/ledger/wiki/CSV-Import) that
attempt to handle the same use-cases. I have tried a couple of them, as well as
hledger's integrated CSV import, and ran into issues or didn't like the
usability. That's why I wrote my own scripts for my workflow. Probably not too
useful for anybody else, but I included an example workspace to showcase how I
use the scripts.
Other [scripts](https://github.com/ledger/ledger/wiki/CSV-Import) attempt to
handle the same use-case. I have tried a couple of them, as well as the
integrated CSV import of hledger, and ran into issues with all of them. That's
why I wrote yet another CSV to ledger tool.
There are two scripts, getofx, and toldg. The former uses the Python
[ofxtools](https://ofxtools.readthedocs.io/en/latest/) library to download bank
transactions via OFX and stores them into CSV files. The latter takes CSV files
and transforms them into ledger accounting files.
The OFX script works well for my workflow, but I am not sure if it would be
beneficial to other people. My premise is that I update my accounting files at
least once a week. Therefore, the OFX script downloads the transactions for the
last thirty days and then merges them into the CSV file (if it exists).
You might object that it makes more sense to download all the data on the first
run and then update incrementally on subsequent runs. That was my idea in the
beginning, but it turns out that the OFX credit card interface of my bank only
returns the transactions of the last 60 days. Hence, I downloaded all
transactions manually and then set up the incremental updates. The `example`
directory contains an example configuration for this script.
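The merge step described above could look roughly like the following. This is a minimal sketch, not the code from `getofx.py`: the helper name `merge_rows` and the choice to use the whole row as the duplicate key are assumptions made for illustration.

```python
import csv
import os
from typing import List


def merge_rows(csv_file: str, new_rows: List[List[str]]) -> List[List[str]]:
    """Merge freshly downloaded rows into csv_file, skipping known rows.

    Hypothetical helper: the complete row serves as the duplicate key.
    """
    existing: List[List[str]] = []
    if os.path.isfile(csv_file):
        # Read what is already stored; ignore empty lines.
        with open(csv_file, newline="") as f:
            existing = [row for row in csv.reader(f) if row]

    seen = {tuple(row) for row in existing}
    merged = list(existing)
    for row in new_rows:
        if tuple(row) not in seen:
            seen.add(tuple(row))
            merged.append(row)

    # Rewrite the file with the old rows followed by the new ones.
    with open(csv_file, "w", newline="") as f:
        csv.writer(f).writerows(merged)
    return merged
```

Because the thirty-day window overlaps between weekly runs, most downloaded rows are already present and get skipped. Note that a whole-row key would also drop two genuinely identical transactions made on the same day, so a real implementation would probably prefer a transaction id from the OFX data as the key.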
On the other hand, I am pretty happy with the CSV transformation script. Most of
my workflows are editor based, and the mapping file-based approach makes it easy
to manipulate transactions. Also, the script relies on a single configuration
file, which makes the configuration clearer.
## Dependencies
- jinja2
- ofxtools
- python3.8 or higher
The scripts rely on a couple of newer Python features, such as data-classes,
format-strings, and the walrus-operator. Python 3.8 or later is therefore
required to run the scripts. Additionally, the OFX script relies on ofxtools
(`pip install ofxtools`). Transforming CSV files into the ledger format does not
require additional packages.
## Usage
Invoke python3.8 with either of the scripts and provide a configuration file as
the first argument.
```
git clone https://git.felixm.de/felixm/ledgerpy.git
cd ledgerpy/example
python3 ../toldg.py configs/toldg.json # or `make` alternatively
```
You can see that toldg copies and transforms the input files into the
`processed` directory using the mappings from the CSV files in the `mappings`
directory. For transactions that do not have a matching mapping, toldg appends
a default mapping to `mappings/unmatched.csv`, as shown in the following
listing.
```
expenses,UBER ABO13,credit=-29.37;date=2018/12/03
```
The first part is the new payee (or account2) for the transaction. The second
part is the string used to match the description of the transaction. Normally,
the script compares the two strings for equality. If the string starts and ends
with a forward slash `/`, toldg compiles it into a regex and tries to match the
description: `regex.match(description)`. The last field is a query specification
string in the following form:
```
field1=string1;field2=string2
```
I have added this feature to specify different payees for the same store. For
example, sometimes I get groceries from Target and other times furniture
(household expenses). In case multiple mappings match a transaction, the script
uses the first match. I might change that to prefer the mapping that has the
most query parameters. That way one could have one default mapping (usually I
get groceries from Target), but then override it on a case-by-case basis
without getting warnings.
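The matching rules above can be sketched as follows. This is an illustration under assumptions, not the code in `toldg.py`: the names `Mapping`, `parse_mapping`, `matches`, and `pick_mapping` are made up, a transaction is modeled as a plain dict of strings, and the `most_specific` flag implements the "most query parameters" idea that the text only describes as a possible future change.

```python
import csv
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Mapping:
    account2: str
    pattern: Union[str, re.Pattern]
    specifiers: Dict[str, str] = field(default_factory=dict)


def parse_mapping(line: str) -> Mapping:
    """Parse 'account2,description[,field1=string1;field2=string2]'."""
    row = next(csv.reader([line]))
    account2, description = row[0], row[1]
    pattern: Union[str, re.Pattern] = description
    # A description wrapped in forward slashes is treated as a regex.
    if len(description) >= 2 and description.startswith("/") and description.endswith("/"):
        pattern = re.compile(description[1:-1])
    specifiers: Dict[str, str] = {}
    if len(row) > 2 and row[2]:
        for item in row[2].split(";"):
            key, _, value = item.partition("=")
            specifiers[key] = value
    return Mapping(account2, pattern, specifiers)


def matches(mapping: Mapping, transaction: Dict[str, str]) -> bool:
    description = transaction["description"]
    if isinstance(mapping.pattern, str):
        if mapping.pattern != description:
            return False
    elif not mapping.pattern.match(description):  # anchored at the start
        return False
    # Every query field must equal the transaction's value for that field.
    return all(transaction.get(k) == v for k, v in mapping.specifiers.items())


def pick_mapping(mappings: List[Mapping], transaction: Dict[str, str],
                 most_specific: bool = False) -> Optional[Mapping]:
    candidates = [m for m in mappings if matches(m, transaction)]
    if not candidates:
        return None
    if most_specific:
        # max() keeps the earliest candidate when several tie.
        return max(candidates, key=lambda m: len(m.specifiers))
    return candidates[0]
```

With a default mapping `expenses:groceries,TARGET` listed before `expenses:household,TARGET,credit=-28.77;date=2018/12/17`, the first-match rule books the 2018/12/17 purchase as groceries, whereas `most_specific=True` would select the household mapping.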
## Todo
- [ ] Write this readme
- [ ] Create setup.py file
- [ ] Use OFX parser from ofxtools instead of parsing the XML
- [ ] Autoappend latest OFX data to CSV file
- [ ] Include example workspace with mock data to demo my workflow
- [x] Write this readme
- [x] Use OFX parser from ofxtools instead of parsing the XML
- [x] Autoappend latest OFX data to CSV file
- [x] Include example workspace with mock data to demo my workflow

44
example/Makefile Normal file

@ -0,0 +1,44 @@
PY?=python3
LEDGER=hledger
TOLDG=$(PY) ../toldg.py
GETOFX=$(PY) ../getofx.py
TOLDG_CONFIG=configs/toldg.json
GETOFX_CONFIG=configs/getofx.json
LEDGER_FILES=$(wildcard processed/*.ldg)
OUTPUTDIR=processed
LEDGER_ALL=result.ldg

all: toldg merge

help:
	@echo 'Makefile for Ledger automation                     '
	@echo '                                                   '
	@echo 'Usage:                                             '
	@echo '   make getofx   download ofx data and write to CSV'
	@echo '   make toldg    transform CSV files into LDG files'
	@echo '   make merge    merge all ldg files into one      '
	@echo '   make bs       show hledger balance              '
	@echo '   make ui       open hledger-ui in tree view      '
	@echo '   make clean    remove ledger files from output dir'
	@echo '                                                   '

getofx:
	@$(GETOFX) $(GETOFX_CONFIG)

toldg:
	@$(TOLDG) $(TOLDG_CONFIG)

merge:
	@cat $(LEDGER_FILES) > $(LEDGER_ALL)

bs:
	@echo ""
	@$(LEDGER) bs -V --depth 2 -f $(LEDGER_ALL)

ui:
	hledger-ui -V --tree --depth 2 -f $(LEDGER_ALL)

clean:
	@[ ! -d $(OUTPUTDIR) ] || rm -rf $(OUTPUTDIR)/*


@ -0,0 +1,28 @@
{
"secret": "s3cr3t",
"client": {
"url": "https://ofx.bank.com",
"userid": "userid",
"org": "B1",
"clientuid": "clientuid",
"fid": "fid",
"bankid": "bankid",
"version": 220
},
"accounts": [
{
"name": "OFX Checking",
"accttype": "checking",
"acctid": "111111",
"csv_file": "inputs/2018_checking.csv",
"fields": [ "", "date", "description", "amount", "", "", "" ]
},
{
"name": "OFX Credit",
"accttype": "credit",
"acctid": "111111",
"csv_file": "inputs/2018_credit.csv",
"fields": ["date", "", "description", "", "", "amount"]
}
]
}


@ -0,0 +1,17 @@
{
"input_directory": "inputs",
"output_directory": "processed",
"mappings_directory": "mappings",
"csv_configs": [
{
"account1": "assets:checking",
"file_match_regex": ".*_checking\\.csv",
"fields": [ "", "date", "description", "amount", "", "", "" ]
},
{
"account1": "liabilities:credit",
"file_match_regex": ".*_credit\\.csv",
"fields": ["date", "", "description", "", "", "amount"]
}
]
}


@ -0,0 +1,9 @@
Details,Posting Date,Description,Amount,Type,Balance,Check or Slip #
DEBIT,12/24/2018,"KROGER 12/22",-25.87,DEBIT_CARD,0,,
DEBIT,12/20/2018,"KROGER 12/19",-47.77,DEBIT_CARD,0,,
DEBIT,12/17/2018,"TARGET",-28.77,DEBIT_CARD,0,,
DEBIT,12/04/2018,"TARGET",-45.33,DEBIT_CARD,0,,
CREDIT,11/30/2018,"EMPLOYER USA LLC QUICKBOOKS",1337.28,ACH_CREDIT,0,,
DEBIT,11/06/2018,"KROGER 11/05",-12.97,DEBIT_CARD,3788.97,,
CREDIT,11/02/2018,"EMPLOYER USA LLC QUICKBOOKS",1337.50,ACH_CREDIT,0,,
DEBIT,10/30/2018,"Payment to credit card ending in 1337 10/30",-59.90,ACCT_XFER,0,,

@ -0,0 +1,6 @@
Transaction Date,Post Date,Description,Category,Type,Amount
12/03/2018,12/04/2018,UBER ABO13,Travel,Sale,-29.37
11/20/2018,11/21/2018,UBER OBC3E,Travel,Sale,-30.41
11/19/2018,11/20/2018,KROGER 1337,Groceries,Sale,-10.74
11/18/2018,11/19/2018,METRO #83,Groceries,Sale,-12.34
10/30/2018,10/30/2018,Payment Thank You - Web,,Payment,59.90

@ -0,0 +1,5 @@
2018/09/06 Opening Balance
    assets:checking $-479.86
    equity:opening balances


@ -0,0 +1,2 @@
income:job,EMPLOYER USA LLC QUICKBOOKS
assets:transfers:checking-credit,/Payment to credit card ending in 1337/

@ -0,0 +1,4 @@
assets:transfers:checking-credit,Payment Thank You - Web
expenses:household,TARGET,credit=-28.77;date=2018/12/17
expenses:car,METRO #83,credit=-12.34
expenses:car,UBER OBC3E,credit=-30.41;date=2018/11/20

@ -0,0 +1,2 @@
expenses:groceries,/KROGER/
expenses:groceries,TARGET,credit=-45.33;date=2018/12/04

57
example/result.ldg Normal file

@ -0,0 +1,57 @@
2018/12/24 KROGER 12/22 ; DEBIT, 12/24/2018, KROGER 12/22, -25.87, DEBIT_CARD, 0, ,
    expenses:groceries $ 25.87
    assets:checking $ -25.87

2018/12/20 KROGER 12/19 ; DEBIT, 12/20/2018, KROGER 12/19, -47.77, DEBIT_CARD, 0, ,
    expenses:groceries $ 47.77
    assets:checking $ -47.77

2018/12/17 TARGET ; DEBIT, 12/17/2018, TARGET, -28.77, DEBIT_CARD, 0, ,
    expenses:household $ 28.77
    assets:checking $ -28.77

2018/12/04 TARGET ; DEBIT, 12/04/2018, TARGET, -45.33, DEBIT_CARD, 0, ,
    expenses:groceries $ 45.33
    assets:checking $ -45.33

2018/11/30 EMPLOYER USA LLC QUICKBOOKS ; CREDIT, 11/30/2018, EMPLOYER USA LLC QUICKBOOKS, 1337.28, ACH_CREDIT, 0, ,
    income:job $ -1337.28
    assets:checking $ 1337.28

2018/11/06 KROGER 11/05 ; DEBIT, 11/06/2018, KROGER 11/05, -12.97, DEBIT_CARD, 3788.97, ,
    expenses:groceries $ 12.97
    assets:checking $ -12.97

2018/11/02 EMPLOYER USA LLC QUICKBOOKS ; CREDIT, 11/02/2018, EMPLOYER USA LLC QUICKBOOKS, 1337.50, ACH_CREDIT, 0, ,
    income:job $ -1337.50
    assets:checking $ 1337.50

2018/10/30 Payment to credit card ending in 1337 10/30 ; DEBIT, 10/30/2018, Payment to credit card ending in 1337 10/30, -59.90, ACCT_XFER, 0, ,
    assets:transfers:checking-credit $ 59.90
    assets:checking $ -59.90

2018/12/03 UBER ABO13 ; 12/03/2018, 12/04/2018, UBER ABO13, Travel, Sale, -29.37
    expenses $ 29.37
    liabilities:credit $ -29.37

2018/11/20 UBER OBC3E ; 11/20/2018, 11/21/2018, UBER OBC3E, Travel, Sale, -30.41
    expenses:car $ 30.41
    liabilities:credit $ -30.41

2018/11/19 KROGER 1337 ; 11/19/2018, 11/20/2018, KROGER 1337, Groceries, Sale, -10.74
    expenses:groceries $ 10.74
    liabilities:credit $ -10.74

2018/11/18 METRO #83 ; 11/18/2018, 11/19/2018, METRO #83, Groceries, Sale, -12.34
    expenses:car $ 12.34
    liabilities:credit $ -12.34

2018/10/30 Payment Thank You - Web ; 10/30/2018, 10/30/2018, Payment Thank You - Web, , Payment, 59.90
    assets:transfers:checking-credit $ -59.90
    liabilities:credit $ 59.90

2018/09/06 Opening Balance
    assets:checking $-479.86
    equity:opening balances

163
toldg.py

@ -62,7 +62,7 @@ class CsvMapping:
@dataclass
class LdgTransaction:
class Transaction:
"""
Class for ledger transaction to render into ldg file.
"""
@ -129,47 +129,64 @@ def get_mappings(mappings_directory: str) -> List[CsvMapping]:
for m in get_mappings_from_file(f)]
def get_transactions(csv_file, config: CsvConfig, mappings: List[CsvMapping]):
def date_to_date(date):
def get_transactions(csv_file: str, config: CsvConfig) -> List[Transaction]:
def date_to_date(date: str) -> str:
d = datetime.datetime.strptime(date, config.input_date_format)
return d.strftime(config.output_date_format)
def flip_sign(amount):
if amount.startswith("-"):
return amount[1:]
return "-" + amount
def flip_sign(amount: str) -> str:
return amount[1:] if amount.startswith("-") else "-" + amount
def row_to_transaction(row, fields):
""" The user can configure the mapping of CSV fields to the three
required fields date, amount and description via the CsvConfig. """
t = {field: row[index] for index, field in fields}
amount = t['amount']
return Transaction(config.currency, flip_sign(amount), amount,
date_to_date(t['date']), config.account1,
"account2", t['description'], csv_file, ", ".join(row))
fields = [(i, f) for i, f in enumerate(config.fields) if f]
with open(csv_file, 'r') as f:
reader = csv.reader(f, delimiter=config.delimiter,
quotechar=config.quotechar)
for _ in range(config.skip):
next(reader)
transactions = [row_to_transaction(row, fields)
for row in reader if row]
return transactions
def apply_mappings(transactions: List[Transaction], mappings: List[CsvMapping]):
def make_equal_len(str_1, str_2):
max_len = max(len(str_1), len(str_2))
str_1 += " " * (max_len - len(str_1))
str_2 += " " * (max_len - len(str_2))
return (str_1, str_2)
def get_account2(transaction):
def get_matching_mappings(transaction):
t = transaction
matching_mappings = []
for mapping in mappings:
pattern = mapping.description_pattern
if type(pattern) is str and pattern == transaction.description:
pass
elif type(pattern) is re.Pattern and pattern.match(t.description):
pass
else:
if type(pattern) is str and pattern != transaction.description:
continue
elif type(pattern) is re.Pattern and not pattern.match(t.description):
continue
specifiers_match = True
for attr, value in mapping.specifiers:
if getattr(t, attr) != value:
specifiers_match = False
if not specifiers_match:
continue
matching_mappings.append(mapping)
return matching_mappings
if specifiers_match:
matching_mappings.append(mapping)
def get_account2(transaction):
matching_mappings = get_matching_mappings(transaction)
if not matching_mappings:
logging.info(f"No match for {transaction}.")
e = f"expenses,{t.description},credit={t.credit};date={t.date}\n"
unmatched_expenses.append(e)
return "expenses"
return ""
elif len(matching_mappings) == 1:
return matching_mappings[0].account2
else:
@ -179,38 +196,23 @@ def get_transactions(csv_file, config: CsvConfig, mappings: List[CsvMapping]):
logging.info(f" {m}")
return matching_mappings[0].account2
def row_to_transaction(row):
t = {field: row[index] for index, field in fields}
amount = t['amount']
t = LdgTransaction(config.currency, flip_sign(amount), amount,
date_to_date(t['date']), config.account1,
"", t['description'], csv_file, ", ".join(row))
t.account1, t.account2 = make_equal_len(t.account1, get_account2(t))
return t
fields = [(index, field)
for index, field in enumerate(config.fields) if field]
unmatched_expenses = []
with open(csv_file, 'r') as f:
reader = csv.reader(f, delimiter=config.delimiter,
quotechar=config.quotechar)
[next(reader) for _ in range(config.skip)]
transactions = [t
for row in reader
if row
if (t := row_to_transaction(row))
]
return transactions, unmatched_expenses
for t in transactions:
account2 = get_account2(t)
if not account2:
unmatched_expenses.append(t)
account2 = "expenses"
t.account1, t.account2 = make_equal_len(t.account1, account2)
return unmatched_expenses
def render_to_file(transactions, csv_file, ledger_file, template_file=""):
def render_to_file(transactions: List[Transaction], csv_file: str, ledger_file: str):
content = "".join([LEDGER_TRANSACTION_TEMPLATE.format(t=t)
for t in transactions])
status = "no change"
if not os.path.isfile(ledger_file):
with open(ledger_file, 'w') as f:
f.write(new_content)
f.write(content)
status = "new"
else:
with open(ledger_file, 'r') as f:
@ -223,9 +225,25 @@ def render_to_file(transactions, csv_file, ledger_file, template_file=""):
logging.info(f"{csv_file:30} -> {ledger_file:30} | {status}")
def main(config):
def file_age(file):
return time.time() - os.path.getmtime(file)
def write_mappings(unmatched_transactions: List[Transaction], mappings_directory: str):
""" Write mappings for unmatched expenses for update by the user. """
if not unmatched_transactions:
return
fn = os.path.join(mappings_directory, "unmatched.csv")
with open(fn, 'a') as f:
writer = csv.writer(f)
for t in unmatched_transactions:
e = ["expenses", t.description,
f"credit={t.credit};date={t.date}"]
writer.writerow(e)
def process_csv_file(csv_file, mappings: List[CsvMapping], config: Config):
def csv_to_ldg_filename(csv_file: str, config: Config) -> str :
r = csv_file
r = r.replace(config.input_directory, config.output_directory)
r = r.replace(".csv", ".ldg")
return r
def get_csv_config(csv_file: str, csv_configs: List[CsvConfig]) -> CsvConfig:
cs = [c for c in csv_configs
@ -236,41 +254,28 @@ def main(config):
raise Exception(f"More than one config for {csv_file=}.")
return cs[0]
def write_unmatched_expenses(unmatched_expenses, mappings_directory):
if not unmatched_expenses:
return
fn = os.path.join(mappings_directory, "unmatched.csv")
with open(fn, 'a') as f:
for e in unmatched_expenses:
f.write(e)
ledger_file = csv_to_ldg_filename(csv_file, config)
csv_config = get_csv_config(csv_file, config.csv_configs)
transactions = get_transactions(csv_file, csv_config)
unmatched_transactions = apply_mappings(transactions, mappings)
write_mappings(unmatched_transactions, config.mappings_directory)
render_to_file(transactions, csv_file, ledger_file)
def csv_to_ldg_filename(csv_file: str, config: Config):
r = csv_file
r = r.replace(config.input_directory, config.output_directory)
r = r.replace(".csv", ".ldg")
return r
def process_csv_file(csv_file, mappings: List[CsvMapping], config: Config):
ledger_file = csv_to_ldg_filename(csv_file, config)
csv_config = get_csv_config(csv_file, config.csv_configs)
def process_ldg_file(ldg_file: str, config: Config):
file_age = lambda file: time.time() - os.path.getmtime(file)
dest_file = ldg_file.replace(config.input_directory, config.output_directory)
status = "no change"
if not os.path.isfile(dest_file):
status = "new"
shutil.copy(ldg_file, dest_file)
if file_age(dest_file) > file_age(ldg_file):
shutil.copy(ldg_file, dest_file)
status = "update"
logging.info(f"{ldg_file:30} -> {dest_file:30} | {status}")
transactions, unmatched = get_transactions(
csv_file, csv_config, mappings)
write_unmatched_expenses(unmatched, config.mappings_directory)
render_to_file(transactions, csv_file, ledger_file)
def process_ldg_file(ldg_file: str, config: Config):
dest_file = ldg_file.replace(
config.input_directory, config.output_directory)
status = "no change"
if not os.path.isfile(dest_file):
status = "new"
shutil.copy(ldg_file, dest_file)
if file_age(dest_file) > file_age(ldg_file):
shutil.copy(ldg_file, dest_file)
status = "update"
logging.info(f"{ldg_file:30} -> {dest_file:30} | {status}")
def main(config):
input_files = get_files(config.input_directory)
config.csv_configs = [CsvConfig(**c) for c in config.csv_configs]
mappings = get_mappings(config.mappings_directory)
@ -286,7 +291,7 @@ def main(config):
if __name__ == "__main__":
logging.basicConfig(stream=sys.stdout,
level=logging.DEBUG,
level=logging.INFO,
format='%(message)s')
try:
config_file = sys.argv[1]