Update readme. Refactor toldg a little. Provide example files for testing by the user.

master
Felix Martin 2020-08-11 16:30:30 -04:00
parent 3138be8d17
commit 367c38592b
13 changed files with 337 additions and 98 deletions

5
.gitignore vendored

@ -1,5 +1,6 @@
# Ignore sensitive data
gather.json
# Ignore example output file
example/processed
example/mappings/unmatched.csv
# ---> Python
# Byte-compiled / optimized / DLL files
__pycache__/

README.md

@ -1,28 +1,87 @@
# ledgerpy
Scripts to transform different input formats (CSV and OFX) into ledger
accounting files. Includes mapping language to update transaction details
automatically.
accounting files. The scripts allow manipulating ledger transactions based on
CSV mapping files.
There are other [scripts](https://github.com/ledger/ledger/wiki/CSV-Import) that
attempt to handle the same use-cases. I have tried a couple of them, as well as
hledger's integrated CSV import, and ran into issues or didn't like the
usability. That's why I wrote my own scripts for my workflow. Probably not too
useful for anybody else, but I included an example workspace to showcase how I
use the scripts.
Other [scripts](https://github.com/ledger/ledger/wiki/CSV-Import) attempt to
handle the same use-case. I have tried a couple of them, as well as the
integrated CSV import of hledger, and ran into issues with all of them. That's
why I wrote yet another CSV to ledger tool.
There are two scripts, getofx and toldg. The former uses the Python
[ofxtools](https://ofxtools.readthedocs.io/en/latest/) library to download bank
transactions via OFX and stores them in CSV files. The latter takes CSV files
and transforms them into ledger accounting files.
The OFX script works well for my workflow, but I am not sure whether it would
be beneficial to other people. My premise is that I update my accounting files
at least once a week. Therefore, the OFX script downloads the transactions for
the last thirty days and then merges them into the CSV file (if it exists).
You might object that it makes more sense to download all the data on the first
run and then only update on consecutive runs. That was my idea in the beginning,
but it turns out that the OFX credit card interface of my bank only returns the
transactions of the last 60 days. Hence, I downloaded all transactions manually
and then set up the incremental updates. The example directory contains an
example configuration for this script.
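The merge step boils down to appending only new rows to the existing CSV file.
A minimal sketch of that idea, assuming duplicate rows should be dropped (this
is not the actual getofx.py code; `merge_into_csv` is illustrative):
```
# Illustrative sketch of the merge idea described above, not the actual
# getofx.py implementation: append only downloaded rows that are not yet in
# the CSV file, so overlapping 30-day download windows add nothing twice.
import csv
import os
from typing import List


def merge_into_csv(csv_file: str, new_rows: List[List[str]]) -> None:
    existing: List[List[str]] = []
    if os.path.isfile(csv_file):
        with open(csv_file, newline="") as f:
            existing = [row for row in csv.reader(f) if row]
    merged = existing + [row for row in new_rows if row not in existing]
    with open(csv_file, "w", newline="") as f:
        csv.writer(f).writerows(merged)
```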
On the other hand, I am pretty happy with the CSV transformation script. Most of
my workflow is editor-based, and the mapping-file approach makes it easy to
manipulate transactions. Also, the script relies on a single configuration
file, which keeps the configuration clear.
## Dependencies
- jinja2
- ofxtools
- python3.8 or higher
The scripts rely on a couple of newer Python features, such as dataclasses,
f-strings, and the walrus operator. Python 3.8 or later is therefore required
to run the scripts. Additionally, the OFX script relies on ofxtools
(`pip install ofxtools`). Transforming CSV files into the ledger format does not
require additional packages.
## Usage
Invoke python3.8 with either of the scripts and provide a configuration file as
the first argument.
```
git clone https://git.felixm.de/felixm/ledgerpy.git
cd ledgerpy/example
python3 ../toldg.py configs/toldg.json # or `make` alternatively
```
You can see that toldg copies and transforms the input files into the `processed`
directory, using the mappings from the CSV files in the `mappings` directory.
For transactions that do not have a matching mapping, toldg writes a default
mapping to `mappings/unmatched.csv`, as shown in the following listing.
```
expenses,UBER ABO13,credit=-29.37;date=2018/12/03
```
The first part is the target account (account2) for the transaction. The second
part is the string used to match the description of the transaction. By
default, the script uses an exact string comparison. If the string starts and
ends with a forward slash `/`, toldg compiles it into a regex and matches it
against the description: `regex.match(description)`. The last field is a query
specification string of the following form:
```
field1=string1;field2=string2
```
I added this feature to assign different accounts to the same store. For
example, sometimes I get groceries from Target and other times furniture
(household expenses). If multiple mappings match a transaction, the script
uses the first match. I might change that to prefer the mapping with the most
query parameters. That way, one could have a default mapping (usually I get
groceries from Target) but override it on a case-by-case basis without
getting warnings.
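The following sketch condenses the matching rules described above (exact string
compare, `/regex/` patterns, `field=value` specifiers, first match wins). It is
adapted from the `get_matching_mappings` logic in toldg.py, with simplified
names:
```
# Condensed sketch of the mapping rules: exact string compare by default,
# /regex/ patterns, and field=value specifiers; the first matching mapping
# wins. Adapted from get_matching_mappings in toldg.py (names simplified).
import re
from typing import List, Tuple, Union


def parse_pattern(text: str) -> Union[str, re.Pattern]:
    # A pattern wrapped in forward slashes becomes a regular expression.
    if len(text) > 1 and text.startswith("/") and text.endswith("/"):
        return re.compile(text[1:-1])
    return text


def matches(transaction, pattern, specifiers: List[Tuple[str, str]]) -> bool:
    if isinstance(pattern, str):
        if pattern != transaction.description:
            return False
    elif not pattern.match(transaction.description):
        return False
    # Every specifier (e.g. credit=-29.37;date=2018/12/03) must equal the
    # corresponding transaction attribute.
    return all(getattr(transaction, attr) == value for attr, value in specifiers)
```
Transactions without any matching mapping end up in `mappings/unmatched.csv`
so the user can fill in the account.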
## Todo
- [ ] Write this readme
- [ ] Create setup.py file
- [ ] Use OFX parser from ofxtools instead of parsing the XML
- [ ] Autoappend latest OFX data to CSV file
- [ ] Include example workspace with mock data to demo my workflow
- [x] Write this readme
- [x] Use OFX parser from ofxtools instead of parsing the XML
- [x] Autoappend latest OFX data to CSV file
- [x] Include example workspace with mock data to demo my workflow

44
example/Makefile Normal file

@ -0,0 +1,44 @@
PY?=python3
LEDGER=hledger
TOLDG=$(PY) ../toldg.py
GETOFX=$(PY) ../getofx.py
TOLDG_CONFIG=configs/toldg.json
GETOFX_CONFIG=configs/getofx.json
LEDGER_FILES=$(wildcard processed/*.ldg)
OUTPUTDIR=processed
LEDGER_ALL=result.ldg
all: toldg merge

help:
	@echo 'Makefile for Ledger automation                        '
	@echo '                                                      '
	@echo 'Usage:                                                '
	@echo '   make getofx   download ofx data and write to CSV   '
	@echo '   make toldg    transform CSV files into LDG files   '
	@echo '   make merge    merge all ldg files into one         '
	@echo '   make bs       show hledger balance                 '
	@echo '   make ui       open hledger-ui in tree view         '
	@echo '   make clean    remove ledger files from output dir  '
	@echo '                                                      '

getofx:
	@$(GETOFX) $(GETOFX_CONFIG)

toldg:
	@$(TOLDG) $(TOLDG_CONFIG)

merge:
	@cat $(LEDGER_FILES) > $(LEDGER_ALL)

bs:
	@echo ""
	@$(LEDGER) bs -V --depth 2 -f $(LEDGER_ALL)

ui:
	hledger-ui -V --tree --depth 2 -f $(LEDGER_ALL)

clean:
	@[ ! -d $(OUTPUTDIR) ] || rm -rf $(OUTPUTDIR)/*

example/configs/getofx.json Normal file

@ -0,0 +1,28 @@
{
    "secret": "s3cr3t",
    "client": {
        "url": "https://ofx.bank.com",
        "userid": "userid",
        "org": "B1",
        "clientuid": "clientuid",
        "fid": "fid",
        "bankid": "bankid",
        "version": 220
    },
    "accounts": [
        {
            "name": "OFX Checking",
            "accttype": "checking",
            "acctid": "111111",
            "csv_file": "inputs/2018_checking.csv",
            "fields": [ "", "date", "description", "amount", "", "", "" ]
        },
        {
            "name": "OFX Credit",
            "accttype": "credit",
            "acctid": "111111",
            "csv_file": "inputs/2018_credit.csv",
            "fields": ["date", "", "description", "", "", "amount"]
        }
    ]
}

example/configs/toldg.json Normal file

@ -0,0 +1,17 @@
{
    "input_directory": "inputs",
    "output_directory": "processed",
    "mappings_directory": "mappings",
    "csv_configs": [
        {
            "account1": "assets:checking",
            "file_match_regex": ".*_checking\\.csv",
            "fields": [ "", "date", "description", "amount", "", "", "" ]
        },
        {
            "account1": "liabilities:credit",
            "file_match_regex": ".*_credit\\.csv",
            "fields": ["date", "", "description", "", "", "amount"]
        }
    ]
}
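The `fields` arrays above assign a meaning to each CSV column by position;
empty entries mark columns that toldg ignores. A minimal sketch of that
interpretation, following the `row_to_transaction` helper visible in the
toldg.py diff below (the example data is taken from the input CSVs):
```
# Sketch of how a `fields` array maps one CSV row onto the named fields
# (date, description, amount); empty entries are skipped. Follows the
# row_to_transaction helper shown in the toldg.py diff below.
from typing import Dict, List


def map_row(row: List[str], fields: List[str]) -> Dict[str, str]:
    return {field: row[index] for index, field in enumerate(fields) if field}


# With the checking-account config above:
# map_row(["DEBIT", "12/17/2018", "TARGET", "-28.77", "DEBIT_CARD", "0", ""],
#         ["", "date", "description", "amount", "", "", ""])
# -> {"date": "12/17/2018", "description": "TARGET", "amount": "-28.77"}
```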

example/inputs/2018_checking.csv Normal file

@ -0,0 +1,9 @@
Details,Posting Date,Description,Amount,Type,Balance,Check or Slip #
DEBIT,12/24/2018,"KROGER 12/22",-25.87,DEBIT_CARD,0,,
DEBIT,12/20/2018,"KROGER 12/19",-47.77,DEBIT_CARD,0,,
DEBIT,12/17/2018,"TARGET",-28.77,DEBIT_CARD,0,,
DEBIT,12/04/2018,"TARGET",-45.33,DEBIT_CARD,0,,
CREDIT,11/30/2018,"EMPLOYER USA LLC QUICKBOOKS",1337.28,ACH_CREDIT,0,,
DEBIT,11/06/2018,"KROGER 11/05",-12.97,DEBIT_CARD,3788.97,,
CREDIT,11/02/2018,"EMPLOYER USA LLC QUICKBOOKS",1337.50,ACH_CREDIT,0,,
DEBIT,10/30/2018,"Payment to credit card ending in 1337 10/30",-59.90,ACCT_XFER,0,,

example/inputs/2018_credit.csv Normal file

@ -0,0 +1,6 @@
Transaction Date,Post Date,Description,Category,Type,Amount
12/03/2018,12/04/2018,UBER ABO13,Travel,Sale,-29.37
11/20/2018,11/21/2018,UBER OBC3E,Travel,Sale,-30.41
11/19/2018,11/20/2018,KROGER 1337,Groceries,Sale,-10.74
11/18/2018,11/19/2018,METRO #83,Groceries,Sale,-12.34
10/30/2018,10/30/2018,Payment Thank You - Web,,Payment,59.90


@ -0,0 +1,5 @@
2018/09/06 Opening Balance
    assets:checking              $-479.86
    equity:opening balances


@ -0,0 +1,2 @@
income:job,EMPLOYER USA LLC QUICKBOOKS
assets:transfers:checking-credit,/Payment to credit card ending in 1337/


@ -0,0 +1,4 @@
assets:transfers:checking-credit,Payment Thank You - Web
expenses:household,TARGET,credit=-28.77;date=2018/12/17
expenses:car,METRO #83,credit=-12.34
expenses:car,UBER OBC3E,credit=-30.41;date=2018/11/20


@ -0,0 +1,2 @@
expenses:groceries,/KROGER/
expenses:groceries,TARGET,credit=-45.33;date=2018/12/04

57
example/result.ldg Normal file

@ -0,0 +1,57 @@
2018/12/24 KROGER 12/22 ; DEBIT, 12/24/2018, KROGER 12/22, -25.87, DEBIT_CARD, 0, ,
    expenses:groceries    $ 25.87
    assets:checking       $ -25.87

2018/12/20 KROGER 12/19 ; DEBIT, 12/20/2018, KROGER 12/19, -47.77, DEBIT_CARD, 0, ,
    expenses:groceries    $ 47.77
    assets:checking       $ -47.77

2018/12/17 TARGET ; DEBIT, 12/17/2018, TARGET, -28.77, DEBIT_CARD, 0, ,
    expenses:household    $ 28.77
    assets:checking       $ -28.77

2018/12/04 TARGET ; DEBIT, 12/04/2018, TARGET, -45.33, DEBIT_CARD, 0, ,
    expenses:groceries    $ 45.33
    assets:checking       $ -45.33

2018/11/30 EMPLOYER USA LLC QUICKBOOKS ; CREDIT, 11/30/2018, EMPLOYER USA LLC QUICKBOOKS, 1337.28, ACH_CREDIT, 0, ,
    income:job            $ -1337.28
    assets:checking       $ 1337.28

2018/11/06 KROGER 11/05 ; DEBIT, 11/06/2018, KROGER 11/05, -12.97, DEBIT_CARD, 3788.97, ,
    expenses:groceries    $ 12.97
    assets:checking       $ -12.97

2018/11/02 EMPLOYER USA LLC QUICKBOOKS ; CREDIT, 11/02/2018, EMPLOYER USA LLC QUICKBOOKS, 1337.50, ACH_CREDIT, 0, ,
    income:job            $ -1337.50
    assets:checking       $ 1337.50

2018/10/30 Payment to credit card ending in 1337 10/30 ; DEBIT, 10/30/2018, Payment to credit card ending in 1337 10/30, -59.90, ACCT_XFER, 0, ,
    assets:transfers:checking-credit    $ 59.90
    assets:checking                     $ -59.90

2018/12/03 UBER ABO13 ; 12/03/2018, 12/04/2018, UBER ABO13, Travel, Sale, -29.37
    expenses              $ 29.37
    liabilities:credit    $ -29.37

2018/11/20 UBER OBC3E ; 11/20/2018, 11/21/2018, UBER OBC3E, Travel, Sale, -30.41
    expenses:car          $ 30.41
    liabilities:credit    $ -30.41

2018/11/19 KROGER 1337 ; 11/19/2018, 11/20/2018, KROGER 1337, Groceries, Sale, -10.74
    expenses:groceries    $ 10.74
    liabilities:credit    $ -10.74

2018/11/18 METRO #83 ; 11/18/2018, 11/19/2018, METRO #83, Groceries, Sale, -12.34
    expenses:car          $ 12.34
    liabilities:credit    $ -12.34

2018/10/30 Payment Thank You - Web ; 10/30/2018, 10/30/2018, Payment Thank You - Web, , Payment, 59.90
    assets:transfers:checking-credit    $ -59.90
    liabilities:credit                  $ 59.90

2018/09/06 Opening Balance
    assets:checking       $-479.86
    equity:opening balances

163
toldg.py

@ -62,7 +62,7 @@ class CsvMapping:
@dataclass
class LdgTransaction:
class Transaction:
"""
Class for ledger transaction to render into ldg file.
"""
@ -129,47 +129,64 @@ def get_mappings(mappings_directory: str) -> List[CsvMapping]:
for m in get_mappings_from_file(f)]
def get_transactions(csv_file, config: CsvConfig, mappings: List[CsvMapping]):
def date_to_date(date):
def get_transactions(csv_file: str, config: CsvConfig) -> List[Transaction]:
def date_to_date(date: str) -> str:
d = datetime.datetime.strptime(date, config.input_date_format)
return d.strftime(config.output_date_format)
def flip_sign(amount):
if amount.startswith("-"):
return amount[1:]
return "-" + amount
def flip_sign(amount: str) -> str:
return amount[1:] if amount.startswith("-") else "-" + amount
def row_to_transaction(row, fields):
""" The user can configure the mapping of CSV fields to the three
required fields date, amount and description via the CsvConfig. """
t = {field: row[index] for index, field in fields}
amount = t['amount']
return Transaction(config.currency, flip_sign(amount), amount,
date_to_date(t['date']), config.account1,
"account2", t['description'], csv_file, ", ".join(row))
fields = [(i, f) for i, f in enumerate(config.fields) if f]
with open(csv_file, 'r') as f:
reader = csv.reader(f, delimiter=config.delimiter,
quotechar=config.quotechar)
for _ in range(config.skip):
next(reader)
transactions = [row_to_transaction(row, fields)
for row in reader if row]
return transactions
def apply_mappings(transactions: List[Transaction], mappings: List[CsvMapping]):
def make_equal_len(str_1, str_2):
max_len = max(len(str_1), len(str_2))
str_1 += " " * (max_len - len(str_1))
str_2 += " " * (max_len - len(str_2))
return (str_1, str_2)
def get_account2(transaction):
def get_matching_mappings(transaction):
t = transaction
matching_mappings = []
for mapping in mappings:
pattern = mapping.description_pattern
if type(pattern) is str and pattern == transaction.description:
pass
elif type(pattern) is re.Pattern and pattern.match(t.description):
pass
else:
if type(pattern) is str and pattern != transaction.description:
continue
elif type(pattern) is re.Pattern and not pattern.match(t.description):
continue
specifiers_match = True
for attr, value in mapping.specifiers:
if getattr(t, attr) != value:
specifiers_match = False
if not specifiers_match:
continue
matching_mappings.append(mapping)
return matching_mappings
if specifiers_match:
matching_mappings.append(mapping)
def get_account2(transaction):
matching_mappings = get_matching_mappings(transaction)
if not matching_mappings:
logging.info(f"No match for {transaction}.")
e = f"expenses,{t.description},credit={t.credit};date={t.date}\n"
unmatched_expenses.append(e)
return "expenses"
return ""
elif len(matching_mappings) == 1:
return matching_mappings[0].account2
else:
@ -179,38 +196,23 @@ def get_transactions(csv_file, config: CsvConfig, mappings: List[CsvMapping]):
logging.info(f" {m}")
return matching_mappings[0].account2
def row_to_transaction(row):
t = {field: row[index] for index, field in fields}
amount = t['amount']
t = LdgTransaction(config.currency, flip_sign(amount), amount,
date_to_date(t['date']), config.account1,
"", t['description'], csv_file, ", ".join(row))
t.account1, t.account2 = make_equal_len(t.account1, get_account2(t))
return t
fields = [(index, field)
for index, field in enumerate(config.fields) if field]
unmatched_expenses = []
with open(csv_file, 'r') as f:
reader = csv.reader(f, delimiter=config.delimiter,
quotechar=config.quotechar)
[next(reader) for _ in range(config.skip)]
transactions = [t
for row in reader
if row
if (t := row_to_transaction(row))
]
return transactions, unmatched_expenses
for t in transactions:
account2 = get_account2(t)
if not account2:
unmatched_expenses.append(t)
account2 = "expenses"
t.account1, t.account2 = make_equal_len(t.account1, account2)
return unmatched_expenses
def render_to_file(transactions, csv_file, ledger_file, template_file=""):
def render_to_file(transactions: List[Transaction], csv_file: str, ledger_file: str):
content = "".join([LEDGER_TRANSACTION_TEMPLATE.format(t=t)
for t in transactions])
status = "no change"
if not os.path.isfile(ledger_file):
with open(ledger_file, 'w') as f:
f.write(new_content)
f.write(content)
status = "new"
else:
with open(ledger_file, 'r') as f:
@ -223,9 +225,25 @@ def render_to_file(transactions, csv_file, ledger_file, template_file=""):
logging.info(f"{csv_file:30} -> {ledger_file:30} | {status}")
def main(config):
def file_age(file):
return time.time() - os.path.getmtime(file)
def write_mappings(unmatched_transactions: List[Transaction], mappings_directory: str):
""" Write mappings for unmatched expenses for update by the user. """
if not unmatched_transactions:
return
fn = os.path.join(mappings_directory, "unmatched.csv")
with open(fn, 'a') as f:
writer = csv.writer(f)
for t in unmatched_transactions:
e = ["expenses", t.description,
f"credit={t.credit};date={t.date}"]
writer.writerow(e)
def process_csv_file(csv_file, mappings: List[CsvMapping], config: Config):
def csv_to_ldg_filename(csv_file: str, config: Config) -> str :
r = csv_file
r = r.replace(config.input_directory, config.output_directory)
r = r.replace(".csv", ".ldg")
return r
def get_csv_config(csv_file: str, csv_configs: List[CsvConfig]) -> CsvConfig:
cs = [c for c in csv_configs
@ -236,41 +254,28 @@ def main(config):
raise Exception(f"More than one config for {csv_file=}.")
return cs[0]
def write_unmatched_expenses(unmatched_expenses, mappings_directory):
if not unmatched_expenses:
return
fn = os.path.join(mappings_directory, "unmatched.csv")
with open(fn, 'a') as f:
for e in unmatched_expenses:
f.write(e)
ledger_file = csv_to_ldg_filename(csv_file, config)
csv_config = get_csv_config(csv_file, config.csv_configs)
transactions = get_transactions(csv_file, csv_config)
unmatched_transactions = apply_mappings(transactions, mappings)
write_mappings(unmatched_transactions, config.mappings_directory)
render_to_file(transactions, csv_file, ledger_file)
def csv_to_ldg_filename(csv_file: str, config: Config):
r = csv_file
r = r.replace(config.input_directory, config.output_directory)
r = r.replace(".csv", ".ldg")
return r
def process_csv_file(csv_file, mappings: List[CsvMapping], config: Config):
ledger_file = csv_to_ldg_filename(csv_file, config)
csv_config = get_csv_config(csv_file, config.csv_configs)
def process_ldg_file(ldg_file: str, config: Config):
file_age = lambda file: time.time() - os.path.getmtime(file)
dest_file = ldg_file.replace(config.input_directory, config.output_directory)
status = "no change"
if not os.path.isfile(dest_file):
status = "new"
shutil.copy(ldg_file, dest_file)
if file_age(dest_file) > file_age(ldg_file):
shutil.copy(ldg_file, dest_file)
status = "update"
logging.info(f"{ldg_file:30} -> {dest_file:30} | {status}")
transactions, unmatched = get_transactions(
csv_file, csv_config, mappings)
write_unmatched_expenses(unmatched, config.mappings_directory)
render_to_file(transactions, csv_file, ledger_file)
def process_ldg_file(ldg_file: str, config: Config):
dest_file = ldg_file.replace(
config.input_directory, config.output_directory)
status = "no change"
if not os.path.isfile(dest_file):
status = "new"
shutil.copy(ldg_file, dest_file)
if file_age(dest_file) > file_age(ldg_file):
shutil.copy(ldg_file, dest_file)
status = "update"
logging.info(f"{ldg_file:30} -> {dest_file:30} | {status}")
def main(config):
input_files = get_files(config.input_directory)
config.csv_configs = [CsvConfig(**c) for c in config.csv_configs]
mappings = get_mappings(config.mappings_directory)
@ -286,7 +291,7 @@ def main(config):
if __name__ == "__main__":
logging.basicConfig(stream=sys.stdout,
level=logging.DEBUG,
level=logging.INFO,
format='%(message)s')
try:
config_file = sys.argv[1]