Contribute to Open Source. Search issue labels to find the right project for you!

Speed up transaction parser


When importing a map structure via transactions that contains around 1000 lines, the parse takes around 3 seconds.

<img width="1031" alt="mentat-profile" src="">

Updated 27/04/2017 23:53 2 Comments

New verifier implemented in Rust


Write a new verifier for the Zcash v1 (“Sapling”) ZK-proofs from the ground up in Rust. Run the current verifier, unchanged, in parallel with the new verifier, so that (typically) it doesn’t take any more wall-clock time, even though it takes more CPU. Wait until both verifiers have completed. If they both return true, then the combo returns true. If they both return false, then the combo returns false. If they return different results, then the combo logs a detailed error message (including the proof in question, the current message, and so on) and then aborts the process.
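The combination logic described above can be sketched as follows. This is a hypothetical illustration, not actual Zcash code; `Proof`, `legacy_verify`, and `rust_verify` are placeholder names standing in for the real verifiers.

```rust
// Hypothetical stand-ins for the two verifiers; in a real build these
// would call into the existing C++ verifier and the new Rust one,
// each on its own thread.
#[derive(Debug)]
struct Proof(Vec<u8>);

fn legacy_verify(p: &Proof) -> bool {
    !p.0.is_empty() // placeholder logic
}

fn rust_verify(p: &Proof) -> bool {
    !p.0.is_empty() // placeholder logic
}

/// Run both verifiers; if they agree, return the shared verdict.
/// If they disagree, log a detailed message and abort the process.
fn combo_verify(p: &Proof) -> bool {
    let (old, new) = (legacy_verify(p), rust_verify(p));
    if old != new {
        eprintln!("verifier mismatch on proof {:?}: legacy={}, rust={}", p, old, new);
        std::process::abort();
    }
    old
}

fn main() {
    assert!(combo_verify(&Proof(vec![1, 2, 3])));
    assert!(!combo_verify(&Proof(vec![])));
}
```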

(This aborting-the-process part raises questions about blockchain forks and forward progress, but I’m pretty sure it is far more important to alert users and developers, and to prevent users from accidentally accepting invalid payments, than it is to minimize the chance of a blockchain fork. Also, as the Zcash employees well learned during the recent security incident, I’m firmly of the opinion that the only way to communicate with most users is to abort the process.)

Updated 20/04/2017 22:15 9 Comments

Rust: Could not locate executable "rustfmt"


Description :octocat:

Could not locate executable "rustfmt"

When I try to format a rust buffer, using (SPC m =) or (SPC : rust-format-buffer), I get the above error message. Using rustfmt from the command line works. The PATH variable, inspected via (SPC ! "echo $PATH"), contains the Rust bin directory. Code completion via racer (located in the same directory) works just fine.

Reproduction guide :beetle:

  • Start Emacs
  • Open rust file
  • Type: SPC m =

Observed behaviour: :eyes: :broken_heart: Minibuffer displays Could not locate executable "rustfmt", source code in buffer is not formatted.

Expected behaviour: :heart: :smile: Source code in buffer is formatted by rustfmt.

System Info :computer:

  • OS: gnu/linux
  • Emacs: 25.1.1
  • Spacemacs: 0.200.9
  • Spacemacs branch: master (rev. 8e1af145)
  • Graphic display: t
  • Distribution: spacemacs
  • Editing style: vim
  • Completion: helm
  • Layers: elisp (markdown csv auto-completion better-defaults emacs-lisp org (latex :variables latex-enable-auto-fill nil latex-build-command "LaTeX") bibtex git (c-c++ :variables c-c++-default-mode-for-headers 'c++-mode c-c++-enable-clang-support t) semantic (python :variables python-enable-yapf-format-on-save t python-sort-imports-on-save t) (rust :variables rust-format-on-save t))
Updated 14/04/2017 10:57

[parser-utils] Determine a pattern for using `combine` to parse tuples


This is a follow-up: it’s not easy to parse two streams in lock-step using combine, which would be handy for parsing maps, since they iterate naturally as (k, v) pairs. In response, I flatten into a sequence [k1 v1 k2 v2 ...], which combine handles just fine.

combine takes tuples to mean order: that is, (p1, p2) expects p1 to succeed (possibly consuming input) and then p2 to succeed, starting after the input that p1 consumed.

Suppose for the moment that instead (p1, p2) means to parse p1 on the first element of tuples and p2 on the second element of tuples. I don’t know how sensible this is; for example, what would (many(p1), p2) do? I’m filing this to have a place to leave thoughts as they come to us.
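The flattening workaround mentioned above can be illustrated with plain iterators, independently of combine (`flatten_pairs` is a name invented here):

```rust
use std::iter::once;

// Flatten a sequence of (k, v) pairs into [k1, v1, k2, v2, ...] so that a
// purely sequential parser can consume alternating keys and values.
fn flatten_pairs<T>(pairs: Vec<(T, T)>) -> Vec<T> {
    pairs
        .into_iter()
        .flat_map(|(k, v)| once(k).chain(once(v)))
        .collect()
}

fn main() {
    let flat = flatten_pairs(vec![("k1", "v1"), ("k2", "v2")]);
    assert_eq!(flat, vec!["k1", "v1", "k2", "v2"]);
}
```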

Updated 03/04/2017 18:22

[query] Intern values in query parser


It’s currently difficult for us to add any kind of interning to the EDN parser: see

However, in #395 I’m about to make Variable and TypedValue wrappers around an Rc. We thus have an opportunity to intern query parts within the query parser itself, even if the EDN value stream itself contains duplicate strings.

This has some immediate value: not only do we get cloneable ConjoiningClauses (and other consumers of Variable and TypedValue) — the point of #395 — but also as we drop the repeated [edn::Value] parser inputs we can prune some memory.

To do this involves maintaining state for the duration of our combine parse: probably a little struct around a few InternSet<PlainSymbol> and InternSet<String> instances.

This would be threaded into the top parser (Find::find), and then down into each parser it creates. I think the simplest way to do that — avoiding lifetime and mutability issues — is to wrap our interner in an Rc and pass it by cloning. (It’s theoretically possible to use a ThreadLocal for this, but global state is a bit of a downer.)

We’d discard the interner when we’re done with the parse. A future optimization is to keep it around….
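A minimal sketch of the idea, assuming an intern set shaped roughly like a `HashSet` of `Rc` values (the names here are illustrative, not Mentat’s actual types):

```rust
use std::collections::HashSet;
use std::rc::Rc;

// Illustrative stand-in for an InternSet<String>.
#[derive(Default)]
struct InternSet {
    inner: HashSet<Rc<String>>,
}

impl InternSet {
    /// Return the canonical Rc for `s`, inserting it on first sight.
    /// Duplicate strings collapse to a single shared allocation.
    fn intern(&mut self, s: String) -> Rc<String> {
        if let Some(existing) = self.inner.get(&s) {
            return existing.clone();
        }
        let rc = Rc::new(s);
        self.inner.insert(rc.clone());
        rc
    }
}

fn main() {
    let mut set = InternSet::default();
    let a = set.intern("?x".to_string());
    let b = set.intern("?x".to_string());
    assert!(Rc::ptr_eq(&a, &b)); // both point at the same allocation
}
```

To thread this through the parsers as described, the struct could be wrapped in an `Rc<RefCell<InternSet>>` and cloned into each sub-parser.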

Updated 29/04/2017 15:36 1 Comments

ensure all transitive rust dependencies are fetched via our depends system.


Build fails with librustzcash when using a proxy. Affected repository:

output:

  cd /home/riordant/Documents/Project/zcash/depends/work/build/x86_64-unknown-linux-gnu/librustzcash/0.1-2078cca28b9/.; PATH=/home/riordant/Documents/Project/zcash/depends/x86_64-unknown-linux-gnu/native/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin cargo build --release
  Updating registry ``
  warning: spurious network error (2 tries remaining): [12/-2] [7] Couldn't connect to server (Unsupported proxy scheme for '')
  warning: spurious network error (1 tries remaining): [12/-2] [7] Couldn't connect to server (Unsupported proxy scheme for '')
  error: failed to fetch ``

zcash version: v1.0.8

  • OS name + version: Ubuntu 16.04.2 (VM)
  • RAM: 8GB
  • Disk size: 39GB
  • Linux kernel version (uname -a):Linux riordant-virtual-machine 4.4.0-21-generic #37-Ubuntu SMP Mon Apr 18 18:33:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
  • Compiler version (gcc --version): gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
Updated 06/04/2017 22:23 6 Comments

[tx] Accept [a v] notation for lookup refs in entity position


This is a follow-up. In #382, I accept only (lookup-ref a v) for lookup-refs in both entity and value position (e and v in a [:db/add e _ v] assertion). That’s contrary to Datomic, which accepts embedded vectors like [a v] in both locations, treating ambiguity in some ad-hoc way. (See the discussion in #183 for notes on this ambiguity.)

This ticket tracks accepting an embedded vector [a v] in the entity position only. We suspect that the majority of lookup refs are entity refs in this form, and it’s both shorter to type and handy to agree with Datomic.

To implement this, you’ll need to:

1) verify that entid_or_lookup_ref_or_temp_id is still used only to parse entity positions;
2) add a vector_lookup_ref function accepting [a v];
3) use your new function;
4) add tests demonstrating that you can use nested vectors in entity position, and that you can use nested vectors in value position but they’re not interpreted as lookup refs. The latter may already be covered by the existing tests!

Updated 28/03/2017 16:11

[tx] Handle retraction of schema characteristics, like `:db/cardinality` and `:db/unique`


After #370 (fixing #294 and #295), we still need to implement handling [:db/retract :attribute :db/* ...] of schema characteristics. This might also address [:db/retractEntity :attribute] if it comes after

The tricky thing to ensure here is that we fall back to the default correctly, and that the “schema” materialized view does not get out of sync with the in-memory state.

Updated 24/04/2017 18:09

[tx] Extract Error enums from `db/src/`


Per the discussion in , error-chain has real problems with large numbers of error cases. This ticket tracks implementing the approach discussed there:

  • extract various fine-grained Error enums from the errors section in db/src/;
  • use those in a small number of errors.

This is just not something I’ve gotten to.

I anticipate:

  • DB-related errors (basically, rusqlite errors);
  • schema-related errors;
  • tx parsing errors;
  • transacting errors.

In the interim, just overload NotImplementedError.

Updated 20/03/2017 16:14

[tx] Test that transaction IDs are in the correct partition, strictly monotonic increasing, etc


This ticket tracks verifying that transaction IDs are:

  • in :db.part/tx;
  • strictly monotonically increasing;
  • preserved across DB close and re-open (this is really testing the partition map updating).

This is follow-up to

We should also test that :db/txInstant values are:

  • monotonically increasing;
  • in a valid range.

This is follow-up to

It might be necessary to provide a clock in some form to transact, and to make the global Conn and the local TestConn provide clocks.
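The core monotonicity assertion these tests would make is small. A sketch (the helper name is invented here):

```rust
// Check that a slice of transaction IDs, in transaction order,
// is strictly monotonically increasing.
fn strictly_increasing(tx_ids: &[i64]) -> bool {
    tx_ids.windows(2).all(|pair| pair[0] < pair[1])
}

fn main() {
    assert!(strictly_increasing(&[100, 101, 105]));
    assert!(!strictly_increasing(&[100, 100])); // equal IDs are not allowed
    assert!(!strictly_increasing(&[101, 100])); // neither is a decrease
}
```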

Updated 24/04/2017 18:09

Python module in Rust


A few days ago I submitted a proposal to this year’s PyCon TW on writing Python modules in Rust; if it’s accepted, I’ll need to prepare the content.

The talk should first cover ctypes and CFFI, demonstrating how to use these two approaches to call functions in a shared library. It will then introduce CPython’s C/C++ extensions and show how to build a CPython extension in Rust. Finally, it will show some examples of wrapping several common libraries from the Rust ecosystem for use from Python, to improve the performance of Python programs.

If possible, I’ll also add material on PyPy extensions.

Updated 05/03/2017 07:08

Update FasterPath code for Windows compatibility


Appveyor tests have been added and have shown 23 failures in FasterPath tests on Windows. This is without monkey patches or the WITH_REGRESSION environment flag.

Started with run options --seed 43247
 FAIL (0.00s) :: test_it_determins_absolute_path
Expected false to be truthy.
C:/projects/faster-path/test/absolute_test.rb:9:in `test_it_determins_absolute_path'
 FAIL (0.00s) :: test_it_returns_similar_results_to_pathname_absolute?
Expected: true
Actual: false
C:/projects/faster-path/test/absolute_test.rb:15:in `block in test_it_returns_similar_results_to_pathname_absolute?'
C:/projects/faster-path/test/absolute_test.rb:14:in `each'
C:/projects/faster-path/test/absolute_test.rb:14:in `test_it_returns_similar_results_to_pathname_absolute?'
 PASS (0.00s) :: test_it_safely_takes_nil
 PASS (0.00s) :: test_it_does_not_modify_its_argument
 FAIL (0.04s) :: test_it_returns_the_return_all_the_components_of_filename_except_the_last_one_unix_format
--- expected
+++ actual
@@ -1,2 +1,2 @@
-# encoding: UTF-8
+# encoding: ASCII-8BIT
C:/projects/faster-path/test/dirname_test.rb:30:in `test_it_returns_the_return_all_the_components_of_filename_except_the_last_one_unix_format'
 FAIL (0.01s) :: test_it_ignores_a_trailing_slash
--- expected
+++ actual
@@ -1,2 +1,2 @@
-# encoding: UTF-8
+# encoding: ASCII-8BIT
C:/projects/faster-path/test/dirname_test.rb:25:in `test_it_ignores_a_trailing_slash'
 PASS (0.00s) :: test_it_gets_dirnames_correctly
 FAIL (0.01s) :: test_it_returns_all_the_components_of_filename_except_the_last_one
--- expected
+++ actual
@@ -1,2 +1,2 @@
-# encoding: UTF-8
+# encoding: ASCII-8BIT
C:/projects/faster-path/test/dirname_test.rb:5:in `test_it_returns_all_the_components_of_filename_except_the_last_one'
 FAIL (0.01s) :: test_it_returns_the_return_all_the_components_of_filename_except_the_last_one_edge_cases
--- expected
+++ actual
@@ -1,2 +1,2 @@
-# encoding: UTF-8
+# encoding: ASCII-8BIT
C:/projects/faster-path/test/dirname_test.rb:40:in `test_it_returns_the_return_all_the_components_of_filename_except_the_last_one_edge_cases'
 PASS (0.00s) :: test_it_returns_a_string
 PASS (0.00s) :: test_refines_pathname_directory?
 PASS (0.00s) :: test_nil_behaves_the_same
 PASS (0.00s) :: test_of_Cs
 FAIL (0.01s) :: test_of_Bs
Expected: false
Actual: true
C:/projects/faster-path/test/directory_test.rb:42:in `test_of_Bs'
 PASS (0.00s) :: test_of_As
 PASS (0.00s) :: test_nil_for_directory?
 PASS (0.00s) :: test_nil_behaves_the_same
 FAIL (0.00s) :: test_refines_pathname_relative?
Expected true to not be truthy.
C:/projects/faster-path/test/refinements/relative_test.rb:14:in `test_refines_pathname_relative?'
 FAIL (0.00s) :: test_has_trailing_separator_against_pathname_implementation
Expected: true
Actual: false
C:/projects/faster-path/test/has_trailing_separator_test.rb:17:in `test_has_trailing_separator_against_pathname_implementation'
 PASS (0.00s) :: test_has_trailing_separator?
 PASS (0.00s) :: test_has_trailing_separator_with_nil
 PASS (0.00s) :: test_refines_pathname_has_trailing_separator?
 PASS (0.00s) :: test_join
 FAIL (0.00s) :: test_of_Cs
Expected: ["http://", ""]
Actual: ["", ""]
C:/projects/faster-path/test/chop_basename_test.rb:130:in `test_of_Cs'
 PASS (0.00s) :: test_nil_inputs
 FAIL (0.00s) :: test_of_As
Expected: ["aa/a//", "a"]
Actual: ["", "aa/a//a"]
C:/projects/faster-path/test/chop_basename_test.rb:83:in `test_of_As'
 PASS (0.00s) :: test_it_fixes_blank_results_to_pathname_chop_basename
 FAIL (0.02s) :: test_it_returns_similar_results_to_pathname_chop_basename_for_dot_files
a: .hello/world/ and b: .
--- expected
+++ actual
@@ -1,2 +1,2 @@
-# encoding: UTF-8
+# encoding: ASCII-8BIT
C:/projects/faster-path/test/chop_basename_test.rb:71:in `block (2 levels) in test_it_returns_similar_results_to_pathname_chop_basename_for_dot_files'
C:/projects/faster-path/test/chop_basename_test.rb:70:in `each'
C:/projects/faster-path/test/chop_basename_test.rb:70:in `block in test_it_returns_similar_results_to_pathname_chop_basename_for_dot_files'
C:/projects/faster-path/test/chop_basename_test.rb:67:in `each'
C:/projects/faster-path/test/chop_basename_test.rb:67:in `test_it_returns_similar_results_to_pathname_chop_basename_for_dot_files'
 FAIL (0.01s) :: test_it_chops_basename_
--- expected
+++ actual
@@ -1,2 +1,2 @@
-# encoding: ASCII-8BIT
+# encoding: UTF-8
C:/projects/faster-path/test/chop_basename_test.rb:11:in `test_it_chops_basename_'
 FAIL (0.01s) :: test_it_returns_similar_results_to_pathname_chop_basename
a: hello/world/ and b: .
--- expected
+++ actual
@@ -1,2 +1,2 @@
-# encoding: UTF-8
+# encoding: ASCII-8BIT
C:/projects/faster-path/test/chop_basename_test.rb:28:in `block (2 levels) in test_it_returns_similar_results_to_pathname_chop_basename'
C:/projects/faster-path/test/chop_basename_test.rb:27:in `each'
C:/projects/faster-path/test/chop_basename_test.rb:27:in `block in test_it_returns_similar_results_to_pathname_chop_basename'
C:/projects/faster-path/test/chop_basename_test.rb:24:in `each'
C:/projects/faster-path/test/chop_basename_test.rb:24:in `test_it_returns_similar_results_to_pathname_chop_basename'
 FAIL (0.00s) :: test_of_Bs
Expected: ["../", "."]
Actual: ["", ".././"]
C:/projects/faster-path/test/chop_basename_test.rb:106:in `test_of_Bs'
 PASS (0.00s) :: test_it_returns_similar_results_to_pathname_chop_basename_for_slash
 PASS (0.00s) :: test_del_trailing_separator
 PASS (0.00s) :: test_del_trailing_separator_in_dosish_context
 PASS (0.14s) :: test_cache_file
 SKIP (0.10s) :: test_install_gem_in_bundler_vendor
 SKIP (0.10s) :: test_install_gem_in_rvm_gemset
 PASS (0.00s) :: test_it_is_blank?
 PASS (0.00s) :: test_refines_pathname_add_trailing_separator_in_dosish_context
 FAIL (0.02s) :: test_add_trailing_separator_against_pathname_implementation
--- expected
+++ actual
@@ -1,2 +1,2 @@
-# encoding: UTF-8
+# encoding: ASCII-8BIT
C:/projects/faster-path/test/add_trailing_separator_test.rb:43:in `test_add_trailing_separator_against_pathname_implementation'
 PASS (0.00s) :: test_it_handles_nil
 PASS (0.00s) :: test_refines_pathname_add_trailing_separator
 FAIL (0.01s) :: test_add_trailing_separator_in_dosish_context
--- expected
+++ actual
@@ -1,2 +1,2 @@
-# encoding: ASCII-8BIT
+# encoding: UTF-8
C:/projects/faster-path/test/add_trailing_separator_test.rb:23:in `test_add_trailing_separator_in_dosish_context'
 FAIL (0.01s) :: test_add_trailing_separator
--- expected
+++ actual
@@ -1,2 +1,2 @@
-# encoding: ASCII-8BIT
+# encoding: UTF-8
C:/projects/faster-path/test/add_trailing_separator_test.rb:12:in `test_add_trailing_separator'
 PASS (0.00s) :: test_nil_behaves_the_same
 FAIL (0.00s) :: test_refines_pathname_absolute?
Expected false to be truthy.
C:/projects/faster-path/test/refinements/absoulte_test.rb:14:in `test_refines_pathname_absolute?'
 PASS (0.00s) :: test_clean_conservative_dosish_stuff
 PASS (0.00s) :: test_clean_conservative_defaults
 FAIL (0.19s) :: test_it_calculates_the_correct_percentage
Expected 40..60 to include -8.9.
C:/projects/faster-path/test/pbench_test.rb:7:in `test_it_calculates_the_correct_percentage'
 PASS (0.00s) :: test_extname
 PASS (0.00s) :: test_extname_edge_cases
 PASS (0.00s) :: test_nil_inputs
 FAIL (0.01s) :: test_substitutability_of_rust_and_ruby_impls
--- expected
+++ actual
@@ -1,2 +1,2 @@
# encoding: ASCII-8BIT
C:/projects/faster-path/test/extname_test.rb:43:in `test_substitutability_of_rust_and_ruby_impls'
 PASS (0.00s) :: test_it_returns_similar_results_to_pathname_entries_as_strings
 FAIL (0.00s) :: test_it_knows_its_relativeness
Expected true to not be truthy.
C:/projects/faster-path/test/relative_test.rb:9:in `test_it_knows_its_relativeness'
 FAIL (0.00s) :: test_it_knows_its_relativeness_in_dos_like_drive_letters
Expected true to not be truthy.
C:/projects/faster-path/test/relative_test.rb:26:in `test_it_knows_its_relativeness_in_dos_like_drive_letters'
 PASS (0.00s) :: test_it_takes_nil_safely
 PASS (0.00s) :: test_refines_pathname_chop_basename
 PASS (0.00s) :: test_it_redefines_relative?
 PASS (0.00s) :: test_it_redefines_has_trailing_separator
 PASS (0.00s) :: test_it_redefines_absolute?
 PASS (0.00s) :: test_it_redefines_add_trailing_separator
 PASS (0.00s) :: test_it_does_not_redefine_directory?
 PASS (0.00s) :: test_it_redefines_chop_basename
 PASS (0.00s) :: test_plus
 PASS (0.00s) :: test_clean_aggressive_dosish_stuff
 PASS (0.00s) :: test_clean_aggresive_defaults
 PASS (0.00s) :: test_clean_aggressive_defaults
 PASS (0.00s) :: test_it_does_the_same_as_file_basename
 PASS (0.00s) :: test_it_creates_basename_correctly
 PASS (0.00s) :: test_nil_inputs
 PASS (0.00s) :: test_it_build_linked_library
Finished in 0.73281s
72 tests, 302 assertions, 23 failures, 0 errors, 2 skips
Updated 02/03/2017 06:41 1 Comments

[edn] Make EDN allocation more flexible


This is a general ticket tracking making our EDN representation more efficient. We have talked about interning in a variety of contexts (see, for example, #318) but expect to need to intern at the EDN level. So this ticket tracks making our EDN parser take a factory of some sort.

There’s a nice builder/allocator pattern for this in which I think is worth understanding: see

Updated 23/02/2017 23:36 2 Comments

[db] Implement rusqlite::types::FromSql for NamespacedKeyword


via @ncalexan

It’s possible to make let ident: NamespacedKeyword = row.get(0) just work by implementing rusqlite::types::FromSql. However, that’s awkward because the type is in the edn crate and the trait is in the rusqlite crate. It might be worth newtyping this to make it automatic, but we can do that as follow-up. This might just get better with your next ticket, too.

Updated 17/02/2017 22:04

Directly intern values from an input str slice


Each of our structs keeps an owned value.

To avoid excessive duplication, we wish to use Rc here to avoid having thousands of duplicated strings.

But even with InternSet and Rc, we end up creating garbage:

  • Parser sees a &str.
  • Parser creates a struct like Keyword, which clones the slice into a new String.
  • Parser or consumer looks up the Keyword (wrapped in an Rc) in InternSet, which drops the new keyword and returns the existing one.

It would be good to come up with a strategy for going straight from the &str to the Rc<Keyword>. This might be a factory method provided to the parser. It might be an extended EDN representation (edn::Value<InternedString>?). It might be a kind of Borrow hook.

It’s also possible that LLVM will optimize away that allocation… but I doubt it, particularly in the way we currently work (which means collecting all of these Keywords and getting rid of the duplicates later).

This ticket involves a fair amount of good Rust judgment, so it’s not suitable for beginners.
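One concrete option for going straight from the &str to a shared value, sketched here with `Rc<str>` standing in for `Rc<Keyword>`: `Rc<str>` implements `Borrow<str>`, so the set can be probed with the borrowed slice and only allocates on a miss. This is an illustration of the "Borrow hook" idea, not Mentat's actual code:

```rust
use std::collections::HashSet;
use std::rc::Rc;

/// Intern directly from a borrowed slice: probe with &str, and only
/// allocate (Rc::from copies the slice) when the string is new.
fn intern(set: &mut HashSet<Rc<str>>, s: &str) -> Rc<str> {
    if let Some(existing) = set.get(s) {
        return existing.clone(); // no new allocation for duplicates
    }
    let rc: Rc<str> = Rc::from(s);
    set.insert(rc.clone());
    rc
}

fn main() {
    let mut set: HashSet<Rc<str>> = HashSet::new();
    let a = intern(&mut set, "foo/bar");
    let b = intern(&mut set, "foo/bar");
    assert!(Rc::ptr_eq(&a, &b)); // second call found the existing Rc
    assert_eq!(set.len(), 1);
}
```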

Updated 17/02/2017 17:31 1 Comments

[db] Abstract over user_version to allow database sharing


Mentat uses a single integer, the SQLite user version, to handle base storage schema versioning. At some point it might be useful to abstract this a little, allowing the use of a separate table (à la the “table table” in Firefox for iOS) or other mechanism. This would allow Mentat to cooperatively store data inside an existing SQLite database owned by a conventional consumer.

Updated 24/04/2017 18:14 1 Comments

[doc] Simplify the upsert resolution Wiki page to agree with the actual implementation


Now that #283 has landed, the documentation at is stale. Luckily, it’s stale ‘cuz the implementation is simpler than the algorithm description! This ticket is a marker for us to loop back to the Wiki and make sure what we actually do in the Rust implementation is accurately described in the Wiki.

See also #207, which might be addressed concurrently.

Updated 15/02/2017 17:09

[query] Implement simple aggregation


We can support simple aggregation in our queries by generating appropriate GROUP BY clauses (see #311) and projecting aggregate expressions directly in SQL.

The usual aggregate function symbols map directly to SQL aggregate functions:

(def aggregate-functions
  {:avg   :avg
   :count :count
   :max   :max
   :min   :min
   :sum   :total})

and the ClojureScript implementation offers a lot of guidance.
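The same mapping expressed in Rust might look like this sketch (the function name is invented; note that SQLite’s total() returns 0.0 rather than NULL on empty input, which is presumably why :sum maps to :total):

```rust
/// Map a Datalog aggregate function symbol to the SQLite aggregate
/// function it projects to. Unknown symbols yield None.
fn sql_aggregate(symbol: &str) -> Option<&'static str> {
    match symbol {
        "avg" => Some("avg"),
        "count" => Some("count"),
        "max" => Some("max"),
        "min" => Some("min"),
        // total() never returns NULL, unlike sum().
        "sum" => Some("total"),
        _ => None,
    }
}

fn main() {
    assert_eq!(sql_aggregate("sum"), Some("total"));
    assert_eq!(sql_aggregate("median"), None);
}
```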

Updated 24/04/2017 18:15

[core] Add transaction listener interface


This is the Rust equivalent of #61 and Datomic’s Transaction Report Queue. The idea is to push transaction reports to a queue (or invoke a callback, or …) to allow the embedding application to react to specific types of transactions. There are lots of ways to achieve this, all with different trade-offs.

If we want to push all the way across the line, we’ll need some form of transaction listener to push changes to the WebSocket clients.

This depends on #296.

Updated 24/04/2017 22:08 1 Comments

[core] Allow canceling/interrupting Mentat queries


One of the things @rnewman has considered is how to allow canceling or interrupting Mentat queries. He concludes that the best way to achieve this is to farm queries out to separate threads: SQLite doesn’t allow canceling running queries, so the way to achieve this is to kill the thread and let SQLite’s atomic commit mechanism handle clean-up, etc.

This ticket tracks experimenting with this approach, working through whatever communication/synchronization primitives are necessary, whatever thread pooling is desired, etc.

Updated 24/04/2017 22:08 2 Comments

[query] Type code expansion


As we process a collection of patterns into a CC, we collect constraints. For example,

[?x ?a true]

yields a constraint like

ColumnConstraint::EqualsValue(d0_v, TypedValue::Boolean(true))

Some types (including boolean) share a SQLite value representation with other types — 1 could be a ref, an integer, an FTS string, or a boolean.

This ticket tracks expanding appropriate value_type_tag constraints out of a set of column constraints once all value constraints are known. We naturally must wait until all value constraints are known, because ?a might later be bound, which will eliminate the need to constrain value_type_tag.

It’s possible at this point for the space of acceptable type tags to not intersect: e.g., the query

[:find ?x :where
 [?x ?y true]
 [?z ?y ?x]]

where ?y must simultaneously be a ref-typed attribute and a boolean-typed attribute, which is impossible. This function should deduce that and call ConjoiningClauses.mark_known_empty.
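A sketch of the emptiness check, with type tags modelled as small integers (the representation and names are illustrative, not Mentat’s):

```rust
use std::collections::HashSet;

/// Intersect the sets of acceptable value_type_tags implied by two
/// constraints on the same variable. An empty result means the clause
/// can never match -- the point at which mark_known_empty would fire.
fn intersect_tags(a: &HashSet<u8>, b: &HashSet<u8>) -> HashSet<u8> {
    a.intersection(b).cloned().collect()
}

fn main() {
    // e.g. "must be a ref" vs "must be a boolean": disjoint tag sets.
    let ref_tags: HashSet<u8> = [0].iter().cloned().collect();
    let bool_tags: HashSet<u8> = [1].iter().cloned().collect();
    assert!(intersect_tags(&ref_tags, &bool_tags).is_empty());

    // Overlapping sets keep only the shared tags.
    let a: HashSet<u8> = [0, 5].iter().cloned().collect();
    let b: HashSet<u8> = [5, 10].iter().cloned().collect();
    assert_eq!(intersect_tags(&a, &b), [5].iter().cloned().collect());
}
```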

See also #292.

Updated 27/04/2017 16:21

[query] Attribute derivation


Consider a query like

[:find ?x :where [?x ?a 15.5]]

What this query means, and the results it can produce, depend very much on the schema.

If the schema contains no attributes with :db/valueType :db.type/double (we don’t support float), then we know at algebrizing time that this query can return no results.

If the schema contains only one double attribute, then we know that ?a must be bound to that attribute if the query is to return results, and we can act as if ?a is a constant. This can allow us to generate a more efficient query plan.

If the schema contains several attributes that could match, then we can at least constrain the solution space.

This ticket tracks doing that when the algebrizer is a little more complete (after #243).

Updated 14/02/2017 17:53

Rust and other languages


The following notes record how Rust interacts with other programming languages, in two main directions:

  • using code from other languages inside Rust
  • using Rust code from other languages

For Rust calling C code, the basic approach is to use extern and declare the corresponding function names and types. For C calling Rust code, the Rust side needs #[no_mangle] to avoid name mangling, and the interface must use C-compatible types.

I have tried exposing Rust code to C and Python, but writing the bindings by hand is inconvenient. Some tools look like they might help, but I haven’t tried them yet.

On the Python side, I’d like something that, like Cython, offers a convenient way to hook into the setup script, so that a project can include some Rust code or link against a particular library; I don’t know whether that would work well.

  • #[no_mangle] + extern "C" => C ABI function without name mangling
  • #[no_mangle] + extern => C ABI function without name mangling (a bare extern defaults to the "C" ABI)
  • extern "C" => C ABI function with name mangling
  • extern => C ABI function with name mangling
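A minimal example of the first combination above, exporting a Rust function with the C ABI and an unmangled symbol (the function name is arbitrary):

```rust
// With #[no_mangle] + extern "C", the symbol is exported as exactly
// `fast_add` with the C calling convention, so it can be called from C,
// or from Python via ctypes/CFFI once built as a shared library.
#[no_mangle]
pub extern "C" fn fast_add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    // The function remains an ordinary Rust function on this side.
    assert_eq!(fast_add(2, 3), 5);
}
```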


  • Rust Book - Foreign Function Interface:
  • The Rust FFI Omnibus:
    • good examples of using Rust from other programming languages


    • feed in C/C++ header files to generate the Rust-side bindings
    • generate C headers from Rust code
    • Erlang
    • Node.js
    • Python
    • Ruby
    • R


Updated 14/02/2017 08:41





  • LLVM Toolchain:
  • Mozilla:
  • Python:
  • Valgrind:
  • Ruby:
  • Lua:
  • Go:
Updated 13/02/2017 17:18 3 Comments

[query] Second algebrizing phase for bound variables in prepared statements


In the ClojureScript implementation of Mentat, we processed a query from EDN through to SQL results each time it was run.

This isn’t ideal for real-world use: we much prefer to use prepared statements, which allow us to parse a query once, plan it infrequently, and run it many times.

Nevertheless, in the ClojureScript implementation we did things right: external inputs (bound variables) were threaded into the mix after algebrizing, at the point of translation from query context to SQL, rather than being substituted before algebrizing.

At present, our Rust algebrizing phase (part-implemented!) deliberately ignores bound variables, just as in CLJS.

That allows algebrizing to be done once for each schema: a query can be ‘prepared’ once, and only needs to be re-prepared if the set of idents or attributes changes. Values that match any deduced type constraints can be substituted into the same prepared query. (Values that don’t match will cause the query to automatically return no results.)

For many uses of bound (:in) variables, this is fine: queries like

[:find ?person ?email
 :in $ ?name
 :where
 [?person :person/name ?name]
 [?person :person/email ?email]]

won’t benefit from further analysis at the point prior to execution when ?name is known. But queries like

[:find ?x ?z :in $ ?a :where [?x ?a ?z]]

certainly would — knowing ?a allows us to know both which table to query and the expected type of ?z when projected. A query planned with an unbound ?a must query all_datoms and project ?z’s value_type_tag to do runtime type projection, both of which are expensive.

We should do two things:

  • Implement some heuristic to decide whether a second phase of algebrizing is useful. Perhaps this is as simple as “there are :in variables whose types are unknown”, or perhaps “one of the :in variables names an attribute”.
  • Implement a second, cheap phase of algebrizing, and use it prior to query execution as appropriate.
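The first heuristic above could be as small as this hypothetical sketch (the struct and field names are invented for illustration):

```rust
// Illustrative model of an :in variable as seen by the algebrizer.
struct InVariable {
    name: String,
    known_type: Option<&'static str>, // e.g. Some("ref"); None if unknown
    in_attribute_position: bool,      // does it name an attribute?
}

/// Re-algebrize only when some :in variable is still untyped,
/// or names an attribute (which decides tables and value types).
fn needs_second_phase(inputs: &[InVariable]) -> bool {
    inputs
        .iter()
        .any(|v| v.known_type.is_none() || v.in_attribute_position)
}

fn main() {
    let typed = InVariable {
        name: "?name".to_string(),
        known_type: Some("string"),
        in_attribute_position: false,
    };
    assert!(!needs_second_phase(&[typed]));

    let attr = InVariable {
        name: "?a".to_string(),
        known_type: None,
        in_attribute_position: true,
    };
    assert!(needs_second_phase(&[attr]));
}
```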

Queries with no bound parameters, or that we’re happy not re-algebrizing, can be translated directly into SQLite statements for each connection. Other queries will generate a new SQL query each time (though obviously memoization is possible), and so the potential speedups of a more precise query formulation must be traded off against the loss of preparation. This might be a choice we leave to users, just as SQL databases give the option of keeping prepared statements.

I expect queries with certain kinds of :in parameters will always need a second phase of processing, or need to be entirely processed prior to execution. A query with a source var $foo needs to know which tables provide $foo.

Queries can also take a ?log variable representing the database transaction log. That’s used in a tx-ids or tx-data expression. It’s not clear to me how we’ll handle that yet.

This issue tracks figuring out this work when the first stage of the algebrizer is complete.

Updated 10/02/2017 02:06

[tests] Don't compile the test EDN into the test binary


Right now, I’m using include_str! to get the EDN fixtures into our tests. However, this means any change to the test data prompts a recompile, which is slow!

This ticket tracks updating the code around and to instead only include_str! a single file listing the test fixture filenames, and then dynamically reading the test fixture contents at test time.

That’ll make changes to the test list prompt a recompile, but otherwise make it fast to iterate on the test file itself.

Bonus points for using a snippet to expand the test list at compile time, generating functions for each test file. That will produce better cargo test output than a single test that runs through all the test fixtures.
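A sketch of the runtime-loading half, assuming the manifest is a newline-separated list of fixture filenames (the function name is invented; in the real crate only the manifest would be pulled in with include_str!):

```rust
use std::fs;
use std::path::Path;

/// Given the manifest contents (one fixture filename per line) and the
/// fixture directory, read each fixture body at test time rather than
/// compiling it into the binary.
fn load_fixtures(manifest: &str, dir: &Path) -> Vec<(String, String)> {
    manifest
        .lines()
        .map(str::trim)
        .filter(|name| !name.is_empty())
        .map(|name| {
            let body = fs::read_to_string(dir.join(name))
                .unwrap_or_else(|e| panic!("fixture {}: {}", name, e));
            (name.to_string(), body)
        })
        .collect()
}

fn main() {
    // Demonstrate with a throwaway fixture in the temp directory.
    let dir = std::env::temp_dir();
    fs::write(dir.join("demo.edn"), "[:db/add 1 :x 2]").unwrap();
    let fixtures = load_fixtures("demo.edn\n", &dir);
    assert_eq!(fixtures.len(), 1);
    assert_eq!(fixtures[0].1, "[:db/add 1 :x 2]");
}
```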

Updated 09/02/2017 23:59 1 Comments

[tx] Test that datom flags are written to SQL store correctly


This is follow-up to #267 and similar to #268. What we want is tests to ensure the datoms table gets the correct index_* flags set, based on the attribute flags. The datoms schema is around

To test this, use the existing transact code to transact assertions using some of the bootstrap attributes. Then use raw SQL to query the datoms table, following the examples in (You might add a helper there to produce EDN for the “full” or “raw” datoms, making this a little easier to test, and possibly helping later on when we want to compare other parts of the raw tables.)

The interesting cases are:

  • attribute is :db/index true => datom has index_avet = 1
  • attribute is :db/valueType :db.type/ref => datom has index_vaet = 1
  • attribute is :db/fulltext true => datom has index_fulltext = 1 (note: this can’t be tested yet, since fulltext datoms aren’t yet supported!)
  • attribute is :db/unique :db.unique/value => datom has unique_value = 1

Check the positive and negative cases for each of those, and we’re good! This is a pretty good first bug, I think, since there’s lots of code similar to this in the tree already.

Updated 24/04/2017 22:15

[db] Optimize bulk insertions using fixed-size statement caches and base expansion


This ticket tracks making bulk insertions (like at ) more efficient. This is follow-up to

I don’t know the best way to do this, so there’s some experimentation needed. I can think of the following considerations:

  • average transaction size is > 1 assertion (since there’s always a :db/tx :db/txInstant MILLISECONDS datom)
  • most transactions will be small, say < 5 assertions
  • some transactions will be huge, say 10,000 or 100,000 assertions
  • SQLite will accept only a limited number of bindings per statement handle (usually 1000, but possibly as many as 500,000 on some platforms!)
  • there’s a finite store of SQLite statement handles available

So we want to cache a certain number of insertion statements for some number of assertions, and then chunk our input into those insertion statements. If we had unlimited bindings per statement and unlimited statement handles, we might do a base-2 expansion and lazily cache insertions of 1, 2, 4, 8, 16, … assertions. However, given our considerations, it might be better to cache insertions for 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, and (999 / bindings-per-insertion) assertions, effectively dividing the parameter space into “little” and “huge” regimes.
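The chunking arithmetic can be sketched as follows, assuming the common 999-binding default and three bindings (e, a, v) per assertion; the names here are invented for illustration:

```rust
// SQLite's historical default for SQLITE_MAX_VARIABLE_NUMBER is 999.
const MAX_BINDINGS: usize = 999;

/// Split a bulk insert into chunks so that no single statement exceeds
/// the binding limit. Returns the chunk sizes; each chunk would become
/// one cached `INSERT ... VALUES (?,?,?),(?,?,?),...` statement.
fn insert_chunk_sizes(rows: &[(i64, i64, i64)]) -> Vec<usize> {
    let bindings_per_row = 3; // e, a, v
    let rows_per_statement = MAX_BINDINGS / bindings_per_row; // 333
    rows.chunks(rows_per_statement)
        .map(|chunk| chunk.len())
        .collect()
}

fn main() {
    let rows = vec![(0i64, 0i64, 0i64); 1000];
    // 1000 rows at 333 rows/statement: three full chunks plus a remainder.
    assert_eq!(insert_chunk_sizes(&rows), vec![333, 333, 333, 1]);
}
```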

If we wanted to be really fancy, we could determine our caching strategy dynamically at initialization time, or we could generate different strategies at compile time for different hardware profiles (like most FFI implementations do).

This seems to me similar to “Montgomery ladders” for fast multiplication and “base phi expansions” for doing arithmetic in algebraic groups that have non-trivial computable morphisms, with some additional restrictions thrown in.

Updated 08/02/2017 21:08 2 Comments

[db] Don't collect as many intermediate data structures when looking up [a v] pairs


This is follow-up to

What’s happening is that we have a map [a v] -> e expressed as a composition of maps: - [a v] -> temporary search id - temporary search id -> e entid from database

The latter map must be chunked into a number of sub-queries, because SQLite allows binding only a limited number of variables per statement.

The reviewer correctly points out that we don’t need to collect as many intermediate data structures. We could do this as iteration and a mutable map, although I think I prefer a big fold. A “fold-in-place” is best expressed as a for loop in Rust (per the documentation), which is what the reviewer asks for.
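A minimal sketch of that fold-in-place, with entids simplified to i64 and the SQL lookup replaced by an in-memory slice of rows (names hypothetical, not the actual Mentat code):

```rust
use std::collections::HashMap;

// Build the [a v] -> e map with a plain for loop mutating one HashMap
// in place, instead of collecting intermediate Vecs at each stage.
fn resolve(rows: &[((i64, i64), i64)]) -> HashMap<(i64, i64), i64> {
    let mut map = HashMap::with_capacity(rows.len());
    for &(av, e) in rows {
        map.insert(av, e);
    }
    map
}

fn main() {
    let rows = [((10, 20), 100), ((10, 21), 101)];
    let map = resolve(&rows);
    assert_eq!(map[&(10, 20)], 100);
    assert_eq!(map.len(), 2);
}
```

The chunked sub-queries would each feed rows into the same loop, so no per-chunk collections are materialized.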

This ticket isn’t quite ready, because the [a v] lookup code is likely to change a little with #184.

Updated 08/02/2017 20:03

Improve pattern for finite-length iterators in flat_map expansions


There are a bunch of places where we want to turn a collection of tuples (like Vec<(A, B, C)>) into a flat iterator, like A1, B1, C1, A2, B2, C2, .... The pattern we’re using right now is to combine flat_map and std::iter::once, like:

```rust
collection.flat_map(|(a, b)| once(a).chain(once(b)))
```

That avoids heap allocation in favour of stack allocation, but gets pretty gnarly when the tuple length increases. For example: - -

This ticket tracks doing something better. Perhaps a few helper functions (like once_2 and once_3?), or impl (A, B) { fn once(...) { } }, or a macro to at least make the callsite easier to read.
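One possible shape for such a helper: a hypothetical once_3 (not an existing API) that keeps the stack-allocation property but tidies the callsite:

```rust
use std::iter::once;

// Hypothetical helper of the kind the ticket suggests: flatten a
// 3-tuple into an iterator without heap allocation.
fn once_3<T>(a: T, b: T, c: T) -> impl Iterator<Item = T> {
    once(a).chain(once(b)).chain(once(c))
}

fn main() {
    let collection = vec![(1, 2, 3), (4, 5, 6)];
    let flat: Vec<i32> = collection
        .into_iter()
        .flat_map(|(a, b, c)| once_3(a, b, c))
        .collect();
    assert_eq!(flat, vec![1, 2, 3, 4, 5, 6]);
}
```

A macro could generate once_2 through once_n; the tuple-impl spelling would read even better at the callsite, but would require a trait, since inherent impls on tuples aren’t allowed.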

This is probably not a good first bug unless you’re strong in Rust, because some taste is required to determine what’s better than the pattern we have now.

Updated 11/02/2017 10:37 1 Comments

[query] Parse more complex query patterns


After #211 we support queries consisting of one or more data patterns: [?x :foo/bar ?y]. We need to grow more:

  • [x] not
  • [x] not-join
  • [x] and
  • [x] or
  • [x] or-join
  • [ ] Predicate expressions (part done)
  • [ ] Rule expressions

This is unlikely to be a good first bug.

Updated 28/04/2017 09:47

[tx] Allow setting the default partition for string temp IDs


For #184, I’m going to try implementing edn::Value::Text string temp IDs. By default, these will allocate in the :db.part/user partition.

This follows Datomic, but way down at the bottom of it shows a way to configure this “default partition”. This ticket tracks letting this be configured. My initial implementation suggestion is to make this a datom in the store with a cached materialized view; might as well be consistent. Something like:

```edn
[:db/default :db.default/partition :db.part/user]
```

which reads “the DB defaults entity has its default partition set to the user partition”.

Updated 02/02/2017 23:45 1 Comments

[edn] Add EDN differ


Similar to #195, it gets really tiresome trying to visually diff two subtly different edn::Value instances. This ticket tracks implementing some structured diff algorithm. Even just pretty printing and doing a text diff on the two outputs would be better than the madness I’m going through to compare them right now!
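Even the fallback approach is cheap to sketch. Assuming both values have already been pretty-printed to strings (one element per line), a line-by-line comparison pinpoints the first mismatch; this sketch deliberately ignores differing line counts:

```rust
// Compare two pretty-printed values line by line; report the first
// (1-based) line number where they disagree, with both lines.
fn first_diff(left: &str, right: &str) -> Option<(usize, String, String)> {
    for (i, (l, r)) in left.lines().zip(right.lines()).enumerate() {
        if l != r {
            return Some((i + 1, l.to_string(), r.to_string()));
        }
    }
    None
}

fn main() {
    let a = "[1\n 2\n 3]";
    let b = "[1\n 4\n 3]";
    assert_eq!(first_diff(a, b), Some((2, " 2".to_string(), " 4".to_string())));
    assert_eq!(first_diff(a, a), None);
}
```

A real structured diff would walk the two edn::Value trees in parallel instead, but even this much beats eyeballing two long printouts.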

Updated 24/02/2017 09:07 5 Comments

Rust compiler-explorer produces an error with -g and no output with -O


When I use the valid Rust code below, I see assembly generated; it’s all good. I use the default GUI options of compiler-explorer and no compiler options. However, if I add the compiler option:

  • -g, the result is <Compilation failed>.
  • -O, the result is 7 empty lines.

In both cases, I expect to see assembly code.

fn main() {
    let mut sum: u32 = 0;
    let (from, to) = (0, 500);

    for i in from..to {
        sum += i;
    }

    println!("{}", sum) // 124750
}

Updated 20/02/2017 15:47 10 Comments

[meta] Full Datalog type support (instants, UUIDs, URIs)


Over in #170, I simplified things by not including placeholders for Mentat value types I wasn’t going to support at the start: :db.type/instant, :db.type/uuid, and :db.type/uri. (We might consider supporting JSON and a binary blob type like :db.type/bytes as well.)

This ticket tracks supporting these types. Off the top, this will mean:

  1. Adding new DB_TYPE_* values to the bootstrapper;
  2. Bumping the SQL schema to accommodate the new idents;
  3. Adding new ValueType cases;
  4. Adding corresponding TypedValue cases;
  5. Implementing the conversions to and from SQL;
  6. Testing the new types in the transactor and potentially in the query engine as well.

Updated 24/04/2017 22:14 2 Comments

[tx] Prepare and cache SQL statements for :db/add, :db/retract, etc


The set of SQL statements for adding and retracting datoms is limited. We will achieve significant performance benefits by caching the compiled statements and merely re-binding them.

This ticket tracks using the rusqlite statement cache (if it’s appropriate), or implementing our own statement cache for the transactor statements. It’s worth noting that the transactor will be single-connection-on-single-thread, so the cache will not need to be sophisticated.
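rusqlite does ship a statement cache (Connection::prepare_cached). To illustrate how little a single-connection, single-thread transactor actually needs, here is a stand-in sketch keyed by SQL text, with a counter showing that repeated preparation compiles only once; PreparedStatement is a placeholder, not rusqlite’s type:

```rust
use std::collections::HashMap;

struct PreparedStatement; // stand-in for a compiled SQLite statement handle

#[derive(Default)]
struct StatementCache {
    statements: HashMap<String, PreparedStatement>,
    compiles: u32, // how many times we actually had to "compile"
}

impl StatementCache {
    // Return the cached statement for this SQL, compiling on a miss.
    fn prepare_cached(&mut self, sql: &str) -> &mut PreparedStatement {
        let compiles = &mut self.compiles;
        self.statements
            .entry(sql.to_string())
            .or_insert_with(|| {
                *compiles += 1; // only a cache miss pays the compile cost
                PreparedStatement
            })
    }
}

fn main() {
    let mut cache = StatementCache::default();
    cache.prepare_cached("INSERT INTO datoms (e, a, v) VALUES (?, ?, ?)");
    cache.prepare_cached("INSERT INTO datoms (e, a, v) VALUES (?, ?, ?)");
    assert_eq!(cache.compiles, 1); // the second call re-used the statement
}
```

Since the transactor holds the connection on one thread, no locking or eviction policy is needed; the real version would just re-bind parameters on the returned statement.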

Updated 21/04/2017 18:24 2 Comments

[tx] Convert Clojure tests into Rust tests


The Clojure implementation has an extensive correctness test suite. Most of the tests look like:

  1. install the test schema;
  2. transact a few assertions;
  3. assert that the datoms, transactions, or fulltext values table has the expected contents/new rows.

Since our SQL schema is identical to the Clojure implementation’s, it should be easy to move the input and output data into .edn files and write a generic test harness to do those steps.

It might be finicky to get the transaction IDs, timestamps, and other ephemeral data correct. I wonder if we can rewrite :db/tx and :db/txInstant in controlled locations to get what we want? (This could require reader syntax like #value [:db/tx] or similar.)

Updated 01/02/2017 01:08 1 Comments

[tx] Handle _reverse attribute assertions (back references)


An assertion of the form

```edn
[[:db/add IDENT :db/_ref-attr ENTID]]
```

is equivalent to the assertion

```edn
[[:db/add ENTID :db/ref-attr IDENT]]
```

That is, this allows adding back-references. It’s particularly useful with the map shorthand of #180: for example, one can add and install a new attribute into :db.part/db using:

```edn
[{:db/id #id-literal[-1]
  :db/ident :my/ident
  :db/valueType :db.type/long
  :db.install/_attribute :db.part/db}]
```

Updated 24/04/2017 22:13 4 Comments

[tx] Implement :db/tx, :db/txInstant in transactor


I just didn’t get to tracking transaction IDs at all in #170. This might require #184, depending on how it’s implemented, but there are simpler ways to achieve it.

Each transaction gets an extra assertion of the form:

```edn
[[:db/add :db/tx :db/txInstant INSTANT]]
```

where :db/tx is the next sequential transaction ID (in the :db.part/tx partition), and INSTANT is the transactor’s local time. (We’ll ignore clock drift for now – at our own peril.)

Updated 24/04/2017 22:13

Rust backlog


We’re starting on a Rust rewrite of the existing Clojure codebase, using the rust branch.

With an eye to parallelism, the following will get us to parity and a bit further, at which point we’ll start on the list of more speculative work.

Large portions of this will be easier than the first time around: the ClojureScript implementation required lots of deep thinking that won’t need to be repeated.

Starter Projects:

  • [x] Learning Rust. This is a totally valid use of time.
  • [x] [@bgrins] Get Rust builds and tests working in CI.
    • [x] Land initial code (
    • [x] Update readme with build and test instructions (
  • [x] [@jsantell] Begin SQLite wrapper
    • [x] Survey of Rust SQLite libraries.
    • [x] [@bgrins] dependency
    • [x] open a DB at a path and run the default pragmas
    • [x] exec/query abstractions
    • [x] How do we update and deliver SQLite?
  • [x] [@joewalker] Build an EDN parser.
  • [ ] [@jsantell] Build a Rust node native module. End goal: call a Rust function from JS.
  • [ ] Build a Rust library for Android. Call it via JNI.
  • [ ] Build a Rust library for iOS. Sign it. Call it from Swift.
  • [x] [@bgrins] Command-line tool. Right now it wouldn’t do much, but we could put the skeleton in place and do something like print user_version for a DB path…
    • [x] Evaluate exposing as a crate within a subfolder
  • [x] [@bgrins] Figure out an out-of-band test solution for Rust. In-code tests are great, but they’re going to get really verbose when we have piles of SQL generation tests. (see
  • [ ] SQLite generator libraries a la honeysql: #273
    • [ ] Research.
    • [ ] Import and use.
  • [ ] Add a logger, something like

Moving on:

  • [x] Core (the big blocker for much of the rest):
    • [x] Core valueType work: type codes and translation, value conversion.
    • [x] Core schema work. Basic Rust type definitions. Traits for translations.
    • [x] SQLite wrapper: opener, part reading, etc.
    • [x] Bootstrapper.
    • [x] Conn + DB.
  • [ ] Querying:
    • [x] [@rnewman] Datalog query parser: port from DataScript. #160.
    • [x] Parsed query (FindRel) -> abstract query (ConjoiningClauses). Porting datomish.query.
      • This will involve building up much of ‘Core’, above, brick by brick.
    • [x] Abstract query -> SQL.
    • [x] Abstract query -> projectors.
    • [ ] Querying history and logs.
  • [ ] Transactor:
    • [x] Transactor expansion: schema + shorthand -> datoms.
    • [x] Transactor writer.
    • [ ] Transactor loop.
    • [x] Prepared statements/statement caching.
    • [ ] Excision.
    • [ ] noHistory.
  • [ ] Schema management module (porting).
  • [ ] Reader pool (query executor).
    • [ ] Query interruption via closing connection.
    • [ ] Parallel reads.
    • [ ] Connection affinity to exploit sqlite page cache.
    • [ ] Prepared statements for repeated queries.
      • [ ] Invalidation when schema contents change.
      • [ ] Handling of query transformation in the presence of bound values, which might change between uses of the prepared statement.
  • [ ] Tooling. We can invest in this as much as we want; it’s leverage.
    • [ ] A command-line query tool, just like sqlite3.
    • [ ] A REPL that’s capable of doing programmatic read/write tasks. JS bindings? Rust REPL?
    • [ ] A developer tool/embedded server: showing query translations and plans, timing query execution, exploring schema…
    • [ ] GraphQL interface generation from Datomish schema.
    • [ ] Bulk loading and import. Make this independent by expanding into a datom stream wrt a given schema.
    • [ ] Export: textual/EDN, RDF?
  • [ ] Transaction listeners over a socket.
    • [ ] Log dumper standalone tool. Watch writes happening as they occur.
  • [ ] Syncing:
    • [ ] Transaction log replication (out) and replay (in).
      • [ ] Code and algorithms exist.
      • [ ] Define how IDs are mapped in and out, and how those might be efficiently represented in chunks.
      • [ ] Handle application of noHistory attributes.
      • [ ] Handle remote excision.
    • [ ] Multi-master transaction log handling (syncing).
Updated 13/04/2017 02:28 2 Comments

Generate 4 collaboration agreements with people/organizations interested in RUST-based projects


Goal:

Generate 4 collaboration agreements with the people/organizations interested in working on new RUST-based projects by 23 Jan 2017.

Related documents:

Official wiki of the Mozilla México RUST team:

  • Links: Mozilla Activate - RUST: Official RUST documentation: Servo project on GitHub: Play with RUST: RUST by Example: Web framework for Rust: Rust MX community meetup: Introduction to RUST by Mario García:


  • Responsible: Elesban Landero @tuxlan
  • Accountable: Abiel Parra @Heeled-Jim
  • Supporters: Yuli Reyna @yulireyji, Edgardo Ríos @edgardorios, Alex Fuser @alexfuser, David Salgado @KuronekoKat, Rodrigo Moreno @rodmoreno, Uriel Jurado @BoSicle, Abiel Parra @Heeled-Jim
  • Consulted: Mario Martínez @aio00, Jorge Díaz @jdiazmx
  • Informed: Guillermo @deimidis, Rubén @nukeador,
    Community @mozillamexico

Item list:

  • [ ] Propose the first action plan for creating Rust-based projects by Friday 13 Jan 2017 (R: @tuxlan, @Heeled-Jim)
  • [ ] Produce a newsletter, jointly with the Rust CDMX community, covering the active projects and organizations working with RUST by Monday 16 Jan 2017 (R: @tuxlan)
  • [ ] Generate the first 4 collaboration agreements (one related to the Mozilla Science Lab) for RUST-based projects by Friday 20 Jan 2017 (R: @tuxlan, @BoSicle, and @Heeled-Jim)
  • [ ] Add a section to the wiki with the active projects and a history by Monday 23 Jan 2017 (R: @edgardorios)

Related issues:

Updated 09/02/2017 05:28 3 Comments

Easy methods to start learning Rust with


If you see a method you would like to implement open a new issue and get started with the outline below.

Good starter methods

  • <strike>Pathname#has_trailing_separator?</strike>
  • <strike>Pathname#add_trailing_separator</strike>
  • Pathname#del_trailing_separator
  • <strike>Pathname#extname</strike>


  • Write Rust implementation of Pathname to FasterPath
  • Write refinements and monkey-patches for Pathname and map implementation to FasterPath
  • Write tests proving FasterPath operates the same as Pathname
  • Write benchmarks in test/benches and compare the performance. And make a copy of the tests for test/pbench in the pbench_suite.rb file (the outline is in the file).
  • Also compare both the original to the new method to verify compatibility. (See chop_basename A,B,C tests)
Updated 01/03/2017 23:03 4 Comments

Implement Pathname#join


Rust seems to have its own path joining methods. This “may” do exactly what we want.

  • [ ] Write a Rust implementation of Pathname#join as FasterPath.join
  • [ ] Write refinements and monkey-patches for Pathname#join and map the implementation to FasterPath.join
  • [ ] Write tests proving FasterPath.join operates the same as Pathname#join
  • [ ] Write benchmarks in test/benches and compare the performance.
  • [ ] Copy benchmarks to test/pbench/pbench_suite.rb
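For a sense of what std::path already gives us, a hedged sketch (join_paths is a hypothetical helper, not FasterPath’s API) built on PathBuf::push, which, like Pathname#join, discards everything before an absolute component:

```rust
use std::path::PathBuf;

// Join a base path with a sequence of components, Pathname#join-style.
fn join_paths(base: &str, parts: &[&str]) -> PathBuf {
    let mut buf = PathBuf::from(base);
    for part in parts {
        // push appends a separator as needed; an absolute `part`
        // replaces the whole buffer, matching Pathname#join.
        buf.push(part);
    }
    buf
}

fn main() {
    assert_eq!(join_paths("a", &["b", "c"]), PathBuf::from("a/b/c"));
    // An absolute component replaces everything before it:
    assert_eq!(join_paths("a", &["/etc", "passwd"]), PathBuf::from("/etc/passwd"));
}
```

Whether PathBuf::push matches Pathname#join on every edge case (trailing separators, empty components) is exactly what the compatibility tests above should establish.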

Updated 02/03/2017 06:39 2 Comments
