Notes from tarpit reading: http://shaffner.us/cs/papers/tarpit.pdf

OOP: intentionally couple state to related behavior

No Silver Bullet, what makes software hard: Complexity, Conformity, Changeability, Invisibility

Dijkstra: testing is hopelessly inadequate… it can be used very effectively to show the presence of bugs but never to show their absence.

State hinders Informal Reasoning, reasoning about expected behavior from the inside, with knowledge of internal code.

Control Complexity: complexity due to having to be concerned about the order of events. Languages with explicit flow control make you think about this.

a = b + 3
c = d + 2
e = f * 4

No reason for this flow, but programmer has to over-specify it (and compilers have to go to lengths to know that the order can be safely ignored). Accidental complexity:

  1. artificial, totally ignorable ordering imposed on programmer
  2. compiler work is done to optimize it away

Note that these two forms of complexity only apply given the assumption that the above code is for an imperative language with guarantees about order of execution; Oz is an example of a programming language that didn’t specify this.

“Running a test in the presence of concurrency with a known initial state and set of inputs tells you nothing at all about what will happen the next time you run that very same test with the very same inputs and the very same starting state… and things can’t really get any worse than that.”

CLOS: Common Lisp Object System, with multiple dispatch (methods can specialize on any/all required arguments, unlike classic OOP single dispatch).

Problems with OOP encapsulation:

  • access to state can still be spread all over the place, e.g. in presence of inheritance
  • encapsulation strongly biased toward single-object constraints; not a lot of help in coordinating multiple object states

Identity and State

Object identity: in OOP, each object is considered uniquely identifiable regardless of attributes. This is intensional identity (as opposed to extensional where objects are the same if their attributes are). Intensional identity opposes relational algebra view of the world.

But OOP is complicated when mutability isn’t needed, and you add concepts like Value Objects where equality is based on values and not identity (in other words, it brings back extensional identity).


Immutable small objects whose equality is based on value, not identity. An int is an int forever, but objects can be mutated, so if you want an object that acts like, say, an int, you want a value object.

TL;DR C++ has copy-by-value because C does, hence both support value objects. Java has no native support for value objects, but you can get a functionally similar/equivalent thing by passing around references to immutable objects (VALJO, VALue Java Object, where all attributes of the obj are final and the obj doesn’t contain other objects w mutable state).

Object Identity exists because state exists, and is a source of error due to mental switching between the meaning of equality. (TODO: why isn’t this a problem in Clojure? You have to mentally switch between values and atoms… how is that any different?).

Hmm I feel like there’s something I’m not quite grasping about what people mean when they say state. Seems like everything has state? I dunno.

Summary: conventional OOP suffers from state-derived and control-derived complexity.

Functional Languages

Pure: Haskell. Impure: ML, in that it advocates avoiding state but still permits it.

Referential transparency: an expression (e.g. a function call) is replaceable by its value… which in practice means a function called with the same args will always return the same value.


Still an implicit left-to-right sequencing of operations, but fortunately control flow like loops are avoided in favor of fold/map.

Kinds of state

When most people talk about state they really mean mutable state.

Stateful methods can be replaced by functions where the state is passed in, and a new version of the state is returned, with the expectation that the new state must be passed in again to future calls (because referential transparency means the same function invocation always returns the same thing).

This paper seems to distinguish b/w procedures and functions based on a procedure having state… it runs and can manipulate inner state. But a function only returns values based on what was passed in and has no inner state.

BUT, this is just using FP to simulate state. In principle, you could build functional programs by just passing in the global god state of the app into every function and just chaining it along, which brings back the problem of single pool of global variables. Ref transparency++, ease of reasoning–. But this is an extreme example.

Argument: state increases modularity:

“Working within a stateful framework it is possible to add state to any component without adjusting the components which invoke it.”

With FP, you have ripple effects where adding “state” to a function means that state needs to be provided by a caller of a caller of a caller, etc, e.g. adding that extra state parameter.

It’s a tradeoff between hiding state and FP where you know exactly what will happen.

But in stateful PLs, you never know if there will be side-effects; you have to inspect all code to really know.

“As with the discipline of (static) typing, it is trading a one-off up-front cost for continuing future gains and safety (“one-off” because each piece of code is written once but is read, reasoned about and tested on a continuing basis).”

In other words, it’s convenient in the moment to have state (the one-off time you write that stateful code), but for all future readers/reasoners, they have to deal with the fallout of losing the guarantee of statelessness.

Monads are kind of a way to have your cake and eat it too: they make it possible to create a stateful sub-language within Haskell while keeping everything properly typed. But still, monads have been insufficient in helping widespread FP adoption.

Logic Programming

State your axioms and desired goals, let the system build the formal proof for each solution. Prolog is seminal logic PL.

“It is worth noting that a single Prolog program can be both correct when read in the first way, and incorrect (for example due to non-termination) when read in the second.”

This is because Prolog axioms aren’t read as purely logical axioms but are applied sequentially, which is why the order of axioms can affect the outcome.

Control issues

Left-to-right and top-to-bottom dependencies exist. Also, extra-logical features such as cuts (which prune backtracking, presumably to prevent non-termination or wasted work) add complexity.

Oz gives you flexibility of control rather than Prolog’s fixed depth-first search, but rather than sprinkling control decisions into the code, they live at a separate level; in other words, the way you execute code is configurable outside the code itself, rather than contaminating the code with control complexity.

Classifying state:

Goal: determine origins of state, hope that most state turns out to be accidental.

All data is either directly provided to system (input) or derived.

Derived data is either immutable (used only for display) or mutable (because requirements specify user should update that data)

But just because all user input data is essential does not mean it must result in essential state. If it’s avoidable, it’s accidental state.

Input Data:

  • there’s a possibility of referring to that data in the future: essential state
  • there is no possibility (e.g. it’s used to cause some side effect but then can be forgotten): not essential state

Essential Derived Data - immutable

Always rederivable, accidental state if stored.

Essential Derived Data - mutable

I don’t understand this.

Accidental Derived Data

Ideal world:

No caches, no stores of derived calculations of any kind. Result: all state is visible to the user (or tester) of the system, since (disallowed) caching is the main source of hidden state. If you’re not caching, then everything you calculate is presented to the user.


Control is accidental since requirements rarely, if ever, say anything about the order of execution.

Concurrency is accidental; assuming zero time computation, user doesn’t care whether something happens in sequence or parallel.

Real world

  • state is required because most systems have state as part of their true essence (wtf does this mean?)
  • control is accidental, but practically (and for efficiency purposes) it is needed, same goes w state (caching and what not)

Formal Specification

  • property-based: what is required rather than how. Includes algebraic approaches such as Larch and OBJ
  • model-based / state-based: construct a model, often stateful, and specify how it must behave. Implies a stateful approach for how to solve the problem.

Sometimes the ideal-world approach (no accidental state, derive everything) does not best model the program. Example: derived data depends both on the series of user inputs over time AND on its own previous values. In such a case it can help to maintain accidental state (I don’t get this… why is this different from the convenience of storing derived state that isn’t strictly historical?). Example: position of a computer-controlled opponent in an interactive game: technically the position is derivable as f(initialPosition, allMovementsSince), but “this is not the way it is most naturally expressed”. Go fuck yourself and your loose definition of what is natural.

Required Accidental Complexity:

  • perf
    • avoid explicit management of accidental state; instead: simply declare what accidental state should be used, and leave it to separate infrastructure to maintain.
  • ease of expression: (e.g. position of computer controlled opponent)
    • solution: pretend user is typing in this derived state, i.e. pretend that it is essential input

Relational Model

  • structure of data
  • manipulating data
  • maintaining integrity and consistency of state
  • insistence on separation b/w logical and physical layers of system

Data independence: the app / logical model is separate from how the data is actually stored

Structure: use of relations to represent all data
Manipulation: a means to specify derived data
Integrity: constraints
Data Independence: a clear separation is enforced b/w logical data and physical representation

Base Relations: raw tables
Derived Relations (Views): defined in terms of other relations

Access path independence:

Relational structuring allows you to defer access paths (how the data will be queried, join, etc). Before relational model, you had to decide up front, e.g, whether employees would live inside top level departments, or departments within top level employees. This is the hierarchical approach. The network approach is a little better in that you can add cycles but in the end you’re still defining the primary retrieval requirements up front at the expense of not knowing what secondary/future retrieval requirements you’ll have. Again, joining is the canonical example.

OOP and XML suffer same hierarchical problems. Nesting. Who owns what. etc.

Manipulation: relational algebra

  • Restrict: unary operation for selecting a subset of records (a WHERE clause)
  • Project: unary op which creates a new relation with various attributes removed (not added)
  • Product: cartesian product of tables (e.g. SELECT * FROM foo, wat;)
  • Union: binary operation, creates a relation w all records in either arg relation
  • Intersection: binary operation, creates a relation consisting of all records in both
  • Difference: binary operation, all records in the first arg relation that are not in the second (not xor)
  • Join: binary operation constructing all possible records that result from matching identical attributes
  • Divide: ternary operation returning all records of the first arg which occur in the second arg associated with each record of the third arg (wat?)
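As a rough sketch of how these map onto SQL (my own mapping, with hypothetical tables foo(x, y) and bar(x, z); divide has no direct SQL operator):

SELECT * FROM foo WHERE x > 5;                    -- Restrict
SELECT x, y FROM foo;                             -- Project
SELECT * FROM foo, bar;                           -- Product (cross join)
SELECT x FROM foo UNION SELECT x FROM bar;        -- Union
SELECT x FROM foo INTERSECT SELECT x FROM bar;    -- Intersection
SELECT x FROM foo EXCEPT SELECT x FROM bar;       -- Difference
SELECT * FROM foo JOIN bar USING (x);             -- Join on the shared attribute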

Functional Relational Programming

All essential state comes in the form of relations, and essential logic is expressed using relational algebra extended w pure user defined functions

Step 1, specify each of the following

  • Essential State: relational definition of stateful components
  • Essential Logic: derived-relation definitions, integrity constraints, and pure functions
  • Accidental State and Control: declarative specification of a set of perf optimizations for the system
  • Other: user and system interfaces for the outside world

Essential state

Relations, tables, columns (the schema, not actual rows/records). Data should only be considered essential if directly input by the user.

Essential Logic

Pure functions about data transformation, set of db constraints.

Note that we ignore “denormalization for perf” for now because we’re just talking about formal specifications; the physical storage may or may not mirror what’s being specified here.

Accidental State and Control

Specify 1) what state should exist, 2) what physical storage mechanism is used

  • state-related hint: e.g. some derived-relvar should be stored rather than recalculated
  • second kind of state-related hint: infrequently used subset of relvar should be stored elsewhere (e.g. partitioning a table)

Control side:

  • tweaking the evaluator


Declarative lets infrastructure optimize for you, e.g. avoid relational intersection if it can be determined that two groups are mutually exclusive, not possible/easy with imperative.

Normalized relational everything: avoids subjective bias about data access paths. OOP and XML generally force you to do the opposite, choosing nestings ahead of time and other things.

Control is avoided in the relational approach (think of SQL: no order of evaluation; this is intentional).

Explicit parallelism is avoided, but allows for possibility for separated accidental control if required; whether it’s parallel or not shouldn’t matter to anyone other than implementor, i.e. if it really improves things, it’ll be parallel, but functionally it’s the same interface for infrastructure consumers.

Code Volume

Focus on true essentials avoids accidental complexity.

Data Abstraction

Creation of compound data types; to be avoided. Why:

  • Subjective: like baking in data paths in OOP/XML-ish representation of data, is brittle to future use cases. Pre-existing bias will force future use cases into inappropriate reuse of pre-established biased structures. (I like this; this is a source of refactors, when you know what you’re doing is gross because of some new use case)

  • Data Hiding: constructing giant objects often causes unneeded, irrelevant data to be supplied to function, and which data actually gets used is hidden at call site, hurts informal reasoning and testability. Avoiding composite objects helps avoiding this problem.

FRP (func rel) opens door to

  • perf (decided by infrastructure)
  • different dev teams focusing on different components (by components we mean accidental vs essential vs interfacing)… this arg seems weak, or i don’t really understand it

Allowed Types

Can create limited types for essential state/logic components:

  • disjoint union / enumeration types
  • NO product types (types w subsidiary components)


Algebraic data type is a kind of composite type; type formed by combining other types. Product types (tuples and records), and sum types (disjoint unions or variant types). So you can have things like

Action = UserClickEvent | UserDragEvent | Blah

This is a sum type; the total values an Action could be are the total possible values of its variants, summed.

A Product type is, say, a type, e.g. (Int, String), where the total possible values are all the possible values of its components, multiplied (hence product).

Why sum and not product? because Sums don’t add new data types, really, they just categorize for pattern matching and other things. Whereas products create new compound datatypes.

Example app

Derived internal relations:

  • RoomInfo, extend(Room, (roomSize = width*breadth))
    • so RoomInfo is a Room extended with a derived roomSize attribute
  • Acceptance takes accepted Decisions and strips away accepted bool. So an Acceptance is a Decision without an accepted flag. I like this because we’re keeping the domain simple; if we’re dealing with Acceptances, we don’t have to worry that it’s an acceptance that’s not an acceptance; the domain is constrained properly.
  • Rejection is the opposite, but has the same attrs

Accidental state and control

This part is interesting because it suggests that defining relations (like you do when designing/committing a DB schema) is a premature accidental complexity. This paper suggests that you take your essential state types and hint that some of them should be cached. In relational databases, if you CREATE TABLE, you’re creating a cache. I guess this should be obvious. But the thing to note is that what this paper is suggesting is that there is a level above this at which we should be thinking. Everything below is accidental. Whether you CREATE TABLE or recompute on the fly is accidental. The user doesn’t care.

So, hints:

declare store PropertyInfo : create a cache / table for PropertyInfo rather than re-calc

declare store shared Room Floor : denormalize Room and Floor into shared storage structure (hmm, why? is this a join table?)

declare store separate Property (photo) : split out photo from other properties (perf hint).

TODO: read Kow79

Simple Made Easy

Classes as namespaces = bad.

Syntax is inferior to data.

Switching/pattern matching allegedly complects multiple pairs of who’s going to do something and what happens… how is this different than multi-methods?

variables complect value and time (they are state i guess)

for loops complect by explicitly specifying how to do something.

folds still complect because they go from left to right…

Polymorphism à la carte, via Clojure protocols. 1) define data structures, 2) definitions of sets of functions, and 3) connect them together.

Favor declarative Prolog-ish logic programming to littering conditionals all over the place.

Resource contention is inherent complexity, not your fault.


Half-open connections


TL;DR: if you don’t write, you have no way of knowing whether your connection is still alive.

Does setTimeout wait until the end of the event loop to start ticking?

No, it doesn’t:

  var endOfEventLoop;

  setTimeout(function() {
    console.log("" + (+new Date() - endOfEventLoop) + " ms after end of event loop");
  }, 1000);

  var now = +new Date();
  while((+ new Date()) - now < 900) { /* spin for 900ms */ }
  endOfEventLoop = +new Date();


  101ms later

Why is this important?

Because if you’re using magic numbers to, say, open a popup after some animation has occurred, you’re opening yourself up to disaster / timing errors if, after setting your timeout, a lot of slow computation / rendering logic eats up a chunk of that timer.

Primary Key

A primary key is just a column with the following constraints:

  • NOT NULL
  • UNIQUE
Note: you can have multiple NOT NULL / UNIQUE columns, but only one can be marked as PRIMARY.
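A minimal sketch of the equivalence (hypothetical table):

CREATE TABLE things (
  id  integer PRIMARY KEY,       -- implies UNIQUE and NOT NULL, and marks the table's primary key
  sku text    UNIQUE NOT NULL    -- also unique and non-null, but not the primary key
);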

observeOn vs subscribeOn


As shown in this illustration, the SubscribeOn operator designates which thread the Observable will begin operating on, no matter at what point in the chain of operators that operator is called. ObserveOn, on the other hand, affects the thread that the Observable will use below where that operator appears. For this reason, you may call ObserveOn multiple times at various points during the chain of Observable operators in order to change on which threads certain of those operators operate.

Roth IRA

“IRA” stands for Individual Retirement Account.


Roth IRA contributions are not tax-deductible: your contributions come from post-tax income. You pay taxes on that income today, but qualified withdrawals are not taxed in the future.

What does post-tax income mean? If I make 80k pre-tax, let’s say my post-tax income is 60k. If I put $5k into a Roth IRA, my taxable income is still 80k and I’m left with 55k to spend. OH I get it… I think. The amount the government takes from me grows as I make more money. It’d benefit me if I were able to put the 5k in before I was taxed (as with a traditional IRA), so that my taxable income would be 75k, and I’d have to pay the gov less than if I were taxed based on 80k.

Random acts of optimization



TODO: figure out how to get your tmux / bash setup working. Shouldn’t be rocket science. Goal: get irb and psql working.


Object-relational database management system. Postgres is this. MySQL is not; MySQL is just an RDBMS.

An object-relational database (ORD), or object-relational database management system (ORDBMS), is a database management system (DBMS) similar to a relational database, but with an object-oriented database model: objects, classes and inheritance are directly supported in database schemas and in the query language. In addition, just as with pure relational systems, it supports extension of the data model with custom data-types and methods.

An Object Database leverages pointers over joins for collecting data, whereas the relational approach leverages foreign keys, normalization, and storing everything in tabular format. Pros of an Object Database include no mental model mismatch between your programming model (often OOP) and database storage; create an obj, modify an obj, save an obj – no need to figure out how to express your object in tabular format.

Disadvantages are that any sort of reporting / querying needs to be programmed in, whereas relational databases follow set/relational theory and if you’ve got things stored in a tabular format you have way more flexibility to modify / query your database in the future.


Developed at Berkeley as POSTGRES, pioneered many concepts later adopted by commercial databases. Sponsored by DARPA. Didn’t have SQL (btw SQL has been around since the 70s). SQL added in Postgres95, became PostgreSQL.

Comes with a bunch of binaries:

postgres the server; accepts connection and then starts a worker process to handle that connection. (Can verify this; spinning up a rails console creates one more postgres worker instance).

Thus, the master server process is always running, waiting for client connections

psql shell looks like mydb=> for regular users, mydb=# for superuser.

Commands prefixed with a backslash (e.g. \h for help) are psql meta-commands, not SQL; everything else is SQL.

A relation = mathematical term for table.


All data sets represented as tuples, grouped into relations.

“A relation is a data structure which consists of a heading and an unordered set of tuples which share the same type,”

Deviations of SQL from relational model


  • SQL allows duplicate rows; relational model does not. Practically though this is avoided with auto-incrementing primary keys.

SQL apparently also allows anonymous columns and duplicate column names, which make things impossible to reference unambiguously.

SQL includes “NULL” to imply missing data; comparison of NULL with itself is not true, but NULL. (Comparison of anything with NULL yields NULL; it means unknown, not determinable). And hence it’s a form of three-valued logic, rather than just boolean.
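Quick way to see the three-valued logic in psql:

SELECT NULL = NULL;   -- yields NULL, not true
SELECT NULL IS NULL;  -- true; IS NULL / IS NOT NULL are how you actually test for NULL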

Law of excluded middle: It states that for any proposition, either that proposition is true, or its negation is true.

Relational model depends on this law, but SQL does not, since it allows for NULL. Apparently Codd (relational model inventor) eventually suggested a 4-valued logic, probably to differentiate NULL from UNKNOWN, only to have a bunch of smug guys suggesting 19 or even 21-valued logic. WAT. So Postgres just stuck w 3 valued logic.

SQL also uses NULL for other things than value unknown: the sum of an empty set is NULL.

Rows are grouped into tables/relations. Relations are grouped into databases. The set of databases managed by a single Postgres server instance is a cluster.


SELECT * FROM foos, bars;

Gives you every combination (the cartesian product):

 name | name | wat
 foo1 | bar1 | wat1
 foo1 | bar2 | wat2
 foo2 | bar1 | wat1
 foo2 | bar2 | wat2
 foo3 | bar1 | wat1
 foo3 | bar2 | wat2

Or even, adding a third table to the FROM list:

 name | name | wat  | bazname
 foo1 | bar1 | wat1 | baz1
 foo1 | bar1 | wat1 | baz2
 foo1 | bar2 | wat2 | baz1
 foo1 | bar2 | wat2 | baz2
 foo2 | bar1 | wat1 | baz1
 foo2 | bar1 | wat1 | baz2
 foo2 | bar2 | wat2 | baz1
 foo2 | bar2 | wat2 | baz2
 foo3 | bar1 | wat1 | baz1
 foo3 | bar1 | wat1 | baz2
 foo3 | bar2 | wat2 | baz1
 foo3 | bar2 | wat2 | baz2

If any of those tables had zero rows then zero rows would be selected even if there’s data in the other tables.

It is widely considered good style to qualify all column names in a join query, so that the query won’t fail if a duplicate column name is later added to one of the tables.

You could also rewrite the first one as

SELECT * FROM foos INNER JOIN bars ON true;

So selecting * seems to select from all relations/tables involved.

Aggregate functions can’t be used in WHERE clauses, since the WHERE clause determines which rows feed into the aggregates, so the aggregates can’t have been computed yet at that point (it’d be circular).

SELECT city FROM weather WHERE temp_lo = max(temp_lo);

You can change this to a subquery

SELECT city FROM weather
    WHERE temp_lo = (SELECT max(temp_lo) FROM weather);

Aggregates play nicely with GROUP BY clauses: here’s how you’d count how many rows have the same value.

SELECT name, count(*) FROM bars GROUP BY name;

HAVING is like a WHERE clause for grouping / aggregate functions. Specifically, WHERE filters the input rows, and HAVING filters the results after aggregation has taken place:
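Presumably something like this (a sketch reusing the bars table from above):

SELECT name, count(*)
FROM bars
WHERE wat IS NOT NULL   -- WHERE filters rows before grouping
GROUP BY name
HAVING count(*) > 1;    -- HAVING filters the grouped/aggregated results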


Referential integrity

Making sure foreign keys point to existing rows in the referenced table.

Window functions

Similar to GROUP BY, but instead preserves each input row in the output set rather than coalescing them down to just one row; in other words, instead of boiling down to a single answer per group, window functions add that answer as a column to each of the input rows.

SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM empsalary;

Whoa this is pretty awesome:

SELECT depname, empno, salary,
       rank() OVER (PARTITION BY depname ORDER BY salary DESC)
FROM empsalary;

“For each row, the window function is computed across the rows that fall into the same partition as the current row.”

Kinda like how aggregate functions are computed using rows with the same GROUP BY value.

There are partitions and there are window frames…

A window frame is a subset of a partition (or the whole partition itself). Many window functions act on the window frame rows, though some act on the partition rows. By default, if an ORDER BY is supplied in the OVER clause (not the whole query’s ORDER BY), then the window frame consists of all rows from the start of the partition through the current row, plus any following rows that have the same ORDER BY value…


There must be some good reason for this. Can’t think of it now.

SELECT salary, sum(salary) OVER (ORDER BY salary) FROM empsalary;

But the thing to keep in mind I guess is that the ORDER BY here is just another way to configure the partition + window frame. Maybe that’s the way to think of it: OVER clauses specify both the partition and the window frame in one shot. PARTITION BY is one way, giving a partition whose window frame is the whole partition; ORDER BY is another, where the partition is all the rows (filtered by WHERE) and the window frame runs from the start through the current row.
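You can also combine both in one OVER clause (reusing the empsalary table from above): a running total of salary within each department.

SELECT depname, empno, salary,
       sum(salary) OVER (PARTITION BY depname ORDER BY salary) AS running_dept_total
FROM empsalary;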

Window functions can only be used in the SELECT list and in ORDER BY. I guess that makes sense. Can’t use them in GROUP BY, HAVING, or WHERE, since they logically execute after those things.

Aggregate expressions

---                   direct arguments                   aggregated arguments
aggregate_name ( [ expression [ , ... ] ] ) WITHIN GROUP ( order_by_clause ) [ FILTER ...

The argument expressions preceding WITHIN GROUP, if any, are called direct arguments to distinguish them from the aggregated arguments listed in the order_by_clause. Unlike normal aggregate arguments, direct arguments are evaluated only once per aggregate call, not once per input row.

So the stuff in WITHIN GROUP (...) is evaluated once per input row, so ideally we should minimize the work we put in there.

This means that they (direct arguments) can contain variables only if those variables are grouped by GROUP BY; this restriction is the same as if the direct arguments were not inside an aggregate expression at all.

SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY income) FROM households;

0.5 is a direct argument; it makes no sense for it to be substituted with a value that varies across rows.

So for example you could do:

SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY num) from nums;

percentile_disc is an aggregate function. It’s going to spit out values that are the result of coalescing rows. It wouldn’t make sense for me to try and add the column name to this result, because the percentile_disc result is based on coalesced rows:

SELECT name, percentile_disc(0.5) WITHIN GROUP (ORDER BY num) from nums;
ERROR:  column "nums.name" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT name, percentile_disc(0.5) WITHIN GROUP (ORDER BY num...

The only way that’d make sense would be to GROUP BY name so that the aggregate function is only being applied to groups grouped by name rather than all of the input rows at once:

SELECT name, percentile_disc(0.5) WITHIN GROUP (ORDER BY num) from nums GROUP BY name;
  name  | percentile_disc
 alex   |               3
 justin |               2
(2 rows)

Wow this stuff is weird to talk about.

Note that count is not an “ordered set” aggregate:

SELECT name, count(*) WITHIN GROUP (ORDER BY num) from nums GROUP BY name;
ERROR:  count is not an ordered-set aggregate, so it cannot have WITHIN GROUP
LINE 1: SELECT name, count(*) WITHIN GROUP (ORDER BY num) from nums ...

It barfs because I tried to provide a WITHIN GROUP clause to a non ordered set aggregate count. If I remove my WITHIN GROUP clause, I get:

SELECT name, count(*) from nums GROUP BY name;
  name  | count
 alex   |     6
 justin |     3
(2 rows)

So what is an ordered set aggregate? It just means an aggregate fn where performing the computation requires some ordering of the input rows; the ORDER BY inside WITHIN GROUP supplies that ordering (e.g. percentile_disc needs the rows sorted so it can pick the value sitting at the 0.5 mark). It isn’t a grouping key; grouping is still GROUP BY’s job.

SELECT count(*) as unfiltered, count(*) FILTER (WHERE i < 5) as filtered FROM generate_series(1,10) AS s(i);
 unfiltered | filtered
         10 |        4

So why count(*) and not count()? Aggregates need an argument: count(*) counts whole input rows, count(expr) counts rows where expr is non-null, and a bare count() isn’t valid.

Type casts

CAST ( expression AS type )

is the same as

expression::type
You can cast any string literal:

select '4.123'::real;
(1 row)

Postgres implicitly casts things for you, e.g. assigning values to columns (because the column type is obviously known). In some cases you have to be explicit; Postgres works this way so as to not surprise you with silent type casts.



It means how words/letters/phonemes are stored.


Multidimensional Postgres arrays must be rectangular:

ERROR:  multidimensional arrays must have array expressions with matching dimensions


CREATE TYPE myrowtype AS (f1 int, f2 text);
SELECT ('1',2)::myrowtype;

Order of expression evaluation

Left-to-right short-circuit eval is not a thing:

SELECT true OR somefunc();

Hence the above probably won’t run somefunc, but not because it’s on the right and the left side evals to true; rather, the evaluator has already gone through and decided it can do less work by only using the left side.

CASE constructs force evaluation order (branches that aren’t taken aren’t evaluated), so to guarantee somefunc() never runs you could write the above as

SELECT CASE WHEN true THEN true ELSE somefunc() END;



That’s pretty awesome. Declarative all the way. Let the evaluator figure out the best way to go.

One gotcha though is that the evaluator might try to simplify constant subexpressions (subexpressions that don’t depend on any rows being looked up) before the query is even run.
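Presumably the offending query was something along these lines (a reconstruction, reusing the nums table from earlier):

-- looks protected by the CASE, but 1/0 is a constant subexpression
SELECT CASE WHEN num > 0 THEN num ELSE 1/0 END FROM nums;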

ERROR:  division by zero

This happens because 1/0 is a constant subexpression: the planner evaluates it while simplifying the expression, before the query even runs against any rows, causing the divide-by-zero error even if the CASE branch would never be taken.

Foreign keys

A foreign key must reference columns that either are a primary key or form a unique constraint. This means that the referenced columns always have an index (the one underlying the primary key or unique constraint); so checks on whether a referencing row has a match will be efficient.

Makes sense:

CREATE TABLE lols ( thing_id integer REFERENCES nums(num) );
ERROR:  there is no unique constraint matching given keys for referenced table "nums"

Since a DELETE of a row from the referenced table or an UPDATE of a referenced column will require a scan of the referencing table for rows matching the old value, it is often a good idea to index the referencing columns too.

Because this is not always needed, and there are many choices available on how to index, declaration of a foreign key constraint does not automatically create an index on the referencing columns.

So if an article has many comments, then comments.article_id is an FK, hence article.id must have unique constraint. But if the referenced article is deleted, then postgres needs to scan the comments table for all rows referencing the deleted article to perform some logic on it. So you probably want to index comments.article_id. That said you probably want to index it anyway since you’ll be doing many comment lookups for a given article.
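For the articles/comments example just described, a sketch (hypothetical schema):

CREATE TABLE articles (id serial PRIMARY KEY, title text);
CREATE TABLE comments (
  id         serial PRIMARY KEY,
  article_id integer REFERENCES articles (id),  -- requires articles.id to be a PK or unique
  body       text
);

-- not created automatically; speeds up FK checks on DELETE/UPDATE of articles
-- and ordinary "comments for this article" lookups
CREATE INDEX comments_article_id_idx ON comments (article_id);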

Postgres wiki


This seems surprisingly badass.

Schemas / clusters

The only data shared b/w databases is users and groups.

But schemas aren’t kept so separate; a database can have multiple schemas, multiple schemas can have tables w the same name.
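A quick sketch (hypothetical schema and table names):

CREATE SCHEMA app_a;
CREATE SCHEMA app_b;
CREATE TABLE app_a.rooms (id integer PRIMARY KEY);
CREATE TABLE app_b.rooms (id integer PRIMARY KEY);  -- same table name, different schema, same database
SELECT * FROM app_a.rooms;                          -- qualify with the schema name to disambiguate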

Table Expressions

Think of WHERE, GROUP BY, HAVING and others as modifiers that ultimately produce a table that you can select from. Holy shit that would have made my life so much easier if I knew that.



Note that WHERE filtering happens after ON criteria. Again this is a case of JOIN just being a thing that produces another virtual table and WHERE filters whatever table.


Use subqueries when you can’t use joins:

  • subquery is a grouping / aggregation

VALUES can also be a subquery.
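For example (the column and table aliases here are made up):

SELECT t.num, t.letter
FROM (VALUES (1, 'a'), (2, 'b'), (3, 'c')) AS t (num, letter);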


Sedition

conduct or speech inciting rebellion against an authority or monarch

Discount brokerage


Much cheaper than traditional brokerage but doesn’t offer financial advice.

Open-ended mutual funds

Open-ended in that if you buy into a mutual fund, the shares granted to you are created out of thin air, and the price of a share is the fund’s total net asset value divided by the number of shares outstanding.


Intestate

State of having died without having written a will. Intestacy law deals with divvying up inheritance for folks who didn’t write wills.

Estate tax

Only the largest 0.2% of estates in the US pay estate tax, since estates are exempt up to about $5.5 million.

Traditional IRA

  • put in money tax free when you’re in a high tax bracket
  • withdraw money taxed at a (presumably) lower tax bracket when you’re older


Pronunciation dictionary; lets people upload recordings of pronunciation into a map so that you can see how different regions pronounce things.

Inner Join conditions via WHERE/JOIN

Already know you can do

SELECT * FROM a, b WHERE a.wat = b.foo AND a.bleh > 5;

But you can also move the join conditions to the JOIN … ON:

SELECT * FROM a INNER JOIN b ON (a.wat = b.foo) WHERE a.bleh > 5;

These are equivalent, whichever you choose is a matter of style.

Note that this equivalence only holds for INNER joins: the WHERE version of the above only produces inner joins; if you want an outer join, use the OUTER JOIN … ON syntax.

The ON or USING clause of an outer join is not equivalent to a WHERE condition, because it results in the addition of rows (for unmatched input rows) as well as the removal of rows in the final result.

Whereas INNER joins are just restrictions of the cross product of both tables, and hence expressible as a WHERE, which is used for filtering out rows. In other words, you can’t add rows to a result set with WHERE; WHERE only filters, and OUTER joins can add (null-extended) rows to the result set.
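A sketch using the hypothetical a/b tables from above:

-- Equivalent inner joins:
SELECT * FROM a, b WHERE a.wat = b.foo;
SELECT * FROM a INNER JOIN b ON (a.wat = b.foo);

-- Not expressible as a WHERE: rows of a with no match in b are kept,
-- with NULLs filled in for b's columns
SELECT * FROM a LEFT OUTER JOIN b ON (a.wat = b.foo);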

The WHERE clause

Typically references at least one column of the table generated by the FROM clause.


Grouping without aggregate expressions effectively calculates the set of distinct values in a column. This can also be achieved using the DISTINCT clause (see Section 7.3.3).

Makes sense.

We can actually verify that the postgres evaluator reduces this to the exact same query plan:

diff <(psql -c "EXPLAIN SELECT DISTINCT name FROM foos") \
     <(psql -c "EXPLAIN SELECT name FROM foos GROUP BY name")

Objects instead of .publish().connect()

  firstName: "alex",
  lastName:  "matchneer",

  // using lazy computed properties
  fullName: computed('firstName', 'lastName', function() {
    return `${this.get('firstName')} ${this.get('lastName')}`;

  // using lazy computed properties
  fullName: computed(function() {
    return zip('firstName', 'lastName', (first, last) => `${first} ${last}`);

  time: service(),
  completedInWords: computed('completedAt', 'time.sharedTicker', function(completedAt) {
    // hmm this doesn't need to be live
    return moment(this.get('completedAt')).agoInWords();

  liveTimer: liveComputed(function() {
    return this.ref('firstName').flatMapLatest((firstName) => {
      return Observable.interval(500);

  startWeighing() {
    this.set('isWeighing', true);

  isWeighing: false,
  rawWeights: computed('isWeighing', function() {
    if (this.get('isWeighing')) {
      return interval(200).map(() => Math.random());
    } else{
      return null; // or empty observable

  // by subscribing to 'this', we're subscribing to lifetime of object.
  // so we're guaranteed that this gets evaluated immediately upon creation.
  // We want this behavior here
  isStillReceivingData: computed('this', function() {
    return this.ref('rawWeights').flatMapLatest(() => {
      return timer(500).map(_ => false).startWith(true);

  doStuff: computed(function() {
  }).subscribe(), // this means is live for lifetime of object
                  // you know what else is live during lifetime of object?
                  // action handlers...

  doStuff: subscribe('actions.startDoingStuff', function() {

let Timer = Ember.Object.extend({
  hasElapsed: false,
  ms: 0,
  init() {
    this.timerId = Ember.later(() => {

    }, 500);

  hasReceivedDataTimer: computed('foo', function() {
    return Timer.create({ ms: 100 });
  hasReceivedDataRecently: alias('hasReceivedDataTimer.hasElapsed'),

Maybe about to push the envelope


Help Me With Observables

How do I do this with Observables (or CSP or alternatives)

I build/maintain Express Checkout, a hybrid app, which means it’s an Ember.js app running in an embedded browser within an iOS/Android app that occasionally calls into the native app layer to do native-specific things like turn on the camera, scan barcodes, register for push notifications, etc.

I’ve been fascinated with Rx Observables for some time now and have used them to clean up a lot of the code in my app that deals with asynchronous behavior and concurrency. But there’s one part of my app I’ve been avoiding refactoring with Observables because I just can’t wrap my head around how to approach it and express it with Observables, hence I’m enlisting any functionally-minded Observables veterans to help me think about how to structure this.

Read on →



If you try a non-blocking read and there’s no data available for you, then EAGAIN fires.


def _read_from_socket(nbytes)
  begin
    read_nonblock(nbytes)
  rescue Errno::EWOULDBLOCK, Errno::EAGAIN
    if IO.select([self], nil, nil, @timeout)
      retry
    else
      raise Redis::TimeoutError
    end
  end
rescue EOFError
  raise Errno::ECONNRESET
end


  1. Try a read in a non-blocking manner
  2. If no data is available to read, use IO.select to block for up to @timeout seconds until there’s something to read. If there’s still nothing to read after @timeout seconds, it returns nil and a Redis timeout error is raised.
  3. If there is something to read, retry the non-blocking read, which should now succeed.

If it throws TimeoutError then the redis call took longer than the default 5 seconds, probably cuz the server’s overwhelmed.

So, why even bother read_nonblock at first if you’re just going to block on IO.select? Why not just do a blocking read with a timeout? I’m guessing because it doesn’t exist since it can otherwise be expressed with the above structure of 1) try nonblocking read and 2) block with IO.select and retry again. This answer is probably wrong but jesus christ this stuff is nuts.

XCode Build Settings

BOLD means motherfuckin STRING LITERAL as opposed to

Apple iOS plist format


Then you need a link like

<a href="itms-services://?action=download-manifest&url=http://oursite.com/myApp.plist" id="text">



Plot Armor


When you’re so essential to the story that obviously you won’t be killed by some fight or some bullet, you have plot armor. Because you’re crucial to the plot.

Point ember components to github zipped archives

e.g. instead of

"ember": "components/ember#fed005fdc4dc3a8f19324a887c1021e8bf19acf4",


"ember": "https://github.com/components/ember/archive/ae3730263f416204e424f884c8444190e5a967dc.zip",

The former will clone the massively large repo of compiled Ember builds, which takes forever even with an insanely fast connection, while the latter just downloads a zip of a snapshot of the directory tree at that particular checkout, which is FAST.

Time Zones

Basically this railscast is amazing.


Ruby Time.now uses system timezone. Verifiable by opening irb and running Time.now many times whilst changing the timezone in system preferences.

Use around_filter with Time.use_zone(current_user.time_zone, &block)

Less mixins vs extend


Mixins copy and paste CSS rules into every rule set that mixes them in. Extend just creates additional selectors for the same rules, which almost certainly means smaller output CSS size (but perhaps adds to CSS engine overhead since there are more selectors to check against?).

objects vs memoizations

Say you have an array of people:

people = [
  {
    id: 1,
    first_name: "Alex",
    last_name:  "Matchneer",
    follows: [
      { somePersonObjOfSameRecursiveStructure },
    ]
  },
  {
    id: 2,
    first_name: "Noel",
    last_name:  "Gallagher",
    follows: [
      { somePersonObjOfSameRecursiveStructure }
    ]
  }
]

What’s the difference between

people[0]

and

get(people, 0)

? What’s the difference between

people[0].first_name

and

get(get(people, 0), 'first_name')

? What’s the difference between

person # a var with mem address 0x00001234

and

get(MEMORY, 0x00001234)


It’s all just memoization. In the end, every object has a memory address. Variables are just memoized get(MEMORY, someMemoryAddress).

=== in javascript is just a memory address comparison (for objects; primitives compare by value).

What if you needed to print a credits page, and the same person had multiple roles, e.g. director, producer, actor, and you wanted to print out a formal version of their first and last name. Maybe you’d write a function:

function formalizedName(person) {
  return `M. ${person.first_name} ${person.last_name}`;
}

But if you’re rendering a page and that name shows up multiple times, you’re wastefully recomputing, concatenating. Let’s assume avoiding recomputation/concatenation would improve performance by some noticeable margin. You could memoize, i.e. store the result

blah blah blah

binding and immutability

  Hello {{…}}. Your friends are: {{…}}

if people is an immutable data structure, then any of the following modifications will produce a new immutable value of people:

  • a person’s name changes
  • a person’s list of friends changes
  • a person’s friend’s name changes
  • etc

But we’re using {{ }} (bound curlies), which means we’re creating bindings (internally we’re creating keystreams, and those create bindings according to the ember object model). This means we’ll be creating meta objects on each immutable data pojo… which is wasteful and useless considering they’re immutable, and their properties can’t change.

So why not use unbound helper within each curly? It’d save us an observer, right? It’d save us writing to meta, right? Sure, but it also means that it won’t update the second time around, because it’s unbound.

So basically, {{ }} does two things:

1) Sets up a keystream 2) Sets up a binding

TL;DR to get immutable structures bindable in a performant manner in Ember, we need to make it possible to opt into a different KeyStream constructor:

  1. it doesn’t call add/removeObserver (wasteful since it’ll never fire)
  2. don’t assume that just because a changed, that a.prop changed; with immutable data, it’s pretty likely a.prop is actually the same, assuming that’s not one of the properties that was changed between the old immutable value and the new immutable value.

Lazy Observables / Ember Streams

Unlike push-only observables, Ember streams are push/pull. You push that something has changed by notify(), and then later on you pull with value(), which only runs through computations once.

This implies laziness. In Ember, the laziness of an LO lasts from the first .set that changes a watched value until the render run loop queue.

For similar reasons, this is why computed properties don’t work with observers without an explicit get to eagerly flush a CP.

But anyway, what are the tradeoffs between LO and push-only Observables?

Well, one is that, if you funnel an O into an LO, then you’re discarding a bunch of onNexts until some arbitrary pull in the future (yet you only end up reading the most recent “event”).

Here’s a stab at lazy observables: http://jsbin.com/polude/6/edit?html,js,console,output


Android GUI Architecture

Single-threaded, event-driven, nestable components, much like:

  • AWT
    • Java’s original cross-platform UI widget toolkit
  • Swing
    • richer widget set
    • draws its own widgets rather than using host OS’s user interface widgets
  • SWT
    • alternative to AWT/Swing, heavy use/development by Eclipse
  • LWUIT (Lightweight User Interface Toolkit)
    • for Java ME (micro edition, mobile phones, etc)
  • others

So what UI library / environment doesn’t have a single UI thread? It doesn’t seem like there is one. UI data-structures are so fragile and coupled that you’d need to mutex the hell out of them anyway; easier to just have a single UI thread.


Android Event Loop

  1. User touches the screen
  2. Android system enqueues action on event queue
  3. UI thread dequeues event, dispatches to handler
  4. Tell the Model that state has changed
  5. Model notifies UI framework that some portion of display is stale (which is just another action enqueued to the same event queue)
  6. redraw event removed from queue, dispatched to a View, tree of views is redrawn

Specific example:

  1. User taps screen, framework enqueues MotionEvent
  2. MotionEvent is dequeued, framework dispatches to the first view within the bounding box of where tap happened
  3. Button handler tells model to resume playing a song
  4. Model starts playing song, enqueues redraw request
  5. redraw request dequeued, redraw occurs

A Button therefore acts like both a Controller and a View; it handles tap events and updates a model, and then gets redrawn accordingly to reflect updated state.

Never update display within a controller handler; just issue redraw requests. Aside from separating concerns, this lets multiple redraw events essentially coalesce into one, after ALL changes caused by the handler have been made.

Single-threaded-ness means:

  • no synchronize blocks b/w View and Controller; just enqueue and the single threaded UI looper will pop. QUESTION: do you need to synchronize pushing to the queue? What if multiple threads are pushing to the queue? ANSWER: the queue is managed by the Handler class, which is bound to a specific Looper and thread. You post to the Handler, and Handler post methods are threadsafe.
  • it’s easy to completely block/stall your application if you’re doing something long/slow/expensive; move that logic to some other thread

What’s a widget?

Leaves in the view tree, basically.

Tunneling to Redis from the browser

Was reading a Heroku thing about not abusing tunnels via websockets, so I figured I’d connect to Redis from the browser because why not.

First off

npm install -g wstunnel

Then in somefile.html

<input type="text" id="textInput"/>

<pre id="messages"></pre>

<script type="text/javascript" charset="utf-8">
  var redisSocket = new WebSocket("ws://localhost:8080", ["tunnel-protocol"]);

  redisSocket.onmessage = function(event) {
    // need FileReader to convert from blob to text
    var reader = new window.FileReader();
    reader.onloadend = function() {
      messages.innerHTML += reader.result;
    };
    reader.readAsText(event.data);
  };

  textInput.addEventListener('keypress', function(e) {
    if (e.keyCode !== 13) { return; }
    redisSocket.send(new Blob([textInput.value + "\n"]));
    textInput.value = "";
  }, false);
</script>

Then you can type in raw redis commands and get raw redis responses. Pretty cool.


Tunnel public URL to your localhost server. Useful for:

  • testing webhooks
  • testing apps that don’t have access to localhost, etc

Localytics (and analytics terminology)


The quantification of how a given ad impression influences user conversion rates. Use Attribution to find out which ad campaigns seem to be the most effective. I guess you could also say use Attribution to figure out which entry points into an app most often lead to conversion.

  • sessionTimeoutSeconds
    • time after close() that the session is actually considered closed.
    • if open is called within the timeout… on the same localyticsRequest object.

Google Analytics events



Category: over-arching string name for a category of events.

  • You need to decide ahead of time whether you care to distinguish between “Videos - Cats” and “Videos - Dogs” or whether you just want them grouped under “Videos”
  • You’re screwed if you push a version of your code sending a category of “Video” and later change to “Videos”; your historical data will remain there as “Video” (this is probably true for all Event fields)


Action: the thing being done, the name of the event. If your category is “Videos”, you might have actions named:

  • Play
  • Stop
  • Pause


  • “All actions are listed independently from their parent categories”. This means if you re-use the event name “Play” between parent categories “Videos” and “Songs”, they’ll all be munged together, and it’s only when you do a breakdown of “Play” events that their differing parent categories will show up. But you probably don’t want to have something so general as a “click” event across a ton of different categories.
  • “A unique event is determined by a unique action name”. Oh ok this explains the above a bit “You can use duplicate event names across categories”.

document.readyState and friends

  • DOMContentLoaded
    • DOM and synchronous scripts (the default) have been loaded
    • does NOT wait for stylesheets, images, subframes, etc
    • scripts can be made async to not interfere w this loading process
    • UNSURE: can listen on window or document
  • load event
    • all subresources (images, stylesheets, subframes) have loaded
    • only fires on window

But you can also ask document.readyState where you are in the process:


  • loading
    • The document is still loading.
  • interactive
    • The document has finished loading and the document has been parsed but sub-resources such as images, stylesheets and frames are still loading. The state indicates that the DOMContentLoaded event has been fired.
  • complete
    • The document and all sub-resources have finished loading. The state indicates that the load event has been fired.

Add document.readyState to your watches and run:

<!DOCTYPE html>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width">

  <script type="text/javascript" charset="utf-8">


    window.addEventListener("load", function(e) {
      //alert("i am loaded");
    }, false);

    window.addEventListener("DOMContentLoaded", function(e) {
    }, false);


window.onerror and CORS


Script error on line 0.

This almost certainly means a script loaded from another origin (without CORS) fired an error. Modern browsers zero out the error data.


The above 2006 link demonstrates how the errors produced by a non CORS remote script can be used to sniff out which sites you’re logged into, and hence modern browsers strip all information from external non CORS scripts to just say line 0 and “Script Error.”. This thwarts bugsnag unless the script you load is CORS.

Ruby Exception#cause

If one caught exception causes another to be raised, Ruby keeps track of all preceding exceptions within a chain of Exception#cause. Bugsnag uses this to great effect.

    raise "wat"
  rescue => e
rescue => e
  puts e.cause.backtrace # print the backtrace of RuntimeError "wat"


SPF 10000

SPF: Sender Policy Framework

Prevents sender spoofing: a server receiving SMTP mail can check the sender’s IP against SPF DNS records. SPF records are stored in both SPF and TXT records and have a format like:

"v=spf1 a mx ip4: ip4: ip4: ip4: -all"

Question: what if the sender sends the wrong IP? Answer: then the server wouldn’t be able to communicate back to the sender (SMTP operates over TCP, so you have the handshake and connection state preventing IP spoofing).

Google apps for businesses makes you add an SPF record to your domain so that it can send e-mail on your behalf and not have recipient servers block it.

SRV records: share the location of services via DNS



_sip._tcp.example.com. 86400 IN SRV 10 60 5060 bigbox.example.com.

Google apps for businesses also use this for XMPP service location. SRV records have a priority and a weight. Clients must use the lowest-priority services first, and if there are multiple services at that priority, randomly select among them using the provided weights.

Ember tests

import hbs from 'htmlbars-inline-precompile';
import { moduleForComponent, test } from 'ember-qunit';

moduleForComponent('my-component', {
  integration: true
});

test('block params work', function(assert) {

      This happened  days ago.


  this.set('theDate', new Date(2015, 2, 11));
  assert.equal(this.$().text().trim(), "This happened 123 days ago.");

In recent versions, integration:true is the default.

Checked in:


Nice example:


Aaaaand a nice blag!


Vim: save and run

From DAS: easy enough to just write a quick map on the fly:

map ,t :w\|!ruby %<cr>

JS Regex: multiline

Use /m option and your ^ and $ will match beginnings/ends of lines rather than beginnings and ends of the entire string. :)

Hyperthreading, Physical vs Logical Cores

machty.github.com :: sysctl hw.physicalcpu
hw.physicalcpu: 4
machty.github.com :: sysctl hw.logicalcpu
hw.logicalcpu: 8

Feature of Intel Core i5 and i7 (probably others too). Allows multiple threads to run on a single core, squeezing more work out of it in certain cases. Gives a ~20% performance boost in a lot of cases (rather than the 100% boost of a full extra core).

Ember boot


  1. Ember loads.
  2. You create an Ember.Application instance global (e.g. App).
  3. At this point, none of your classes have been loaded yet.
  4. As your JavaScript file is evaluated, you register classes on the application (e.g. App.MyController = Ember.Controller.extend(…);)
  5. Ember waits for DOM ready to ensure that all of your JavaScript included via <script> tags has loaded.
  6. Initializers are run.
  7. If you need to lazily load code or wait for additional setup, you can call deferReadiness().
  8. Once everything is loaded, you can call advanceReadiness().
  9. At this point, we say that the Application is ready; in other words, we have told Ember that all of the classes (components, routes, controllers, etc.) that make up the app are loaded.
  10. A new instance of the application is created, and instance initializers are run.
  11. Routing starts and the UI is rendered to the screen.

Brew Terminology


  • Formula: the package definition (/usr/local/Library/Formula/foo.rb)
  • Keg: the installation prefix of a Formula (/usr/local/Cellar/foo/0.1)
  • opt prefix: a symlink to the active version of a keg (/usr/local/opt/foo)
  • Cellar: all kegs are installed here (/usr/local/Cellar)
  • Tap: an optional repository (git) of Formulae (/usr/local/Library/Taps)
  • Bottle: pre-built (binary) Keg that can be unpacked (qt-4.8.4.mountain_lion.bottle.1.tar.gz)

Slack: Shift Escape

  • shift-esc marks all channels as read

Ember fix force push?

Here’s a git fetch

From github.com:emberjs/ember.js
 + 61c9ba6...c3f15cf master     -> origin/master  (forced update)
   4aab5ad..d1a1a7c  beta       -> origin/beta
 + 5a084f7...e7866ca in-template-config -> origin/in-template-config  (forced update)
 + 172002f...642f5c3 remove-bind-attr -> origin/remove-bind-attr  (forced update)
   c3accfb..1ad89cf  stable     -> origin/stable

vim-rails and other shit i should already know

Come on this has been out forever how do you not know this.

  • ctrl-O back
  • ctrl-I forward
  • gf: go to file of hovered-over class



Instructions for how to remove yourself from various services, social media, etc.

Java final and immutable objects

Since Java strings are immutable, the String class is declared final. Otherwise, someone could subclass String, add mutable state, and override its methods, so code that's handed a "String" could no longer rely on it behaving immutably.

Java @Override annotation

Not strictly required, but hints to the compiler what you’re trying to do, and errors out if you fail to correctly override a parent class’s method.

Java: Checked vs Unchecked Exceptions

Checked: required in the throws clause.

Unchecked: not required in the throws clause; must extend RuntimeException (or Error).

This might be wrong, but checked exceptions kinda feel like they're just part of the type signature, e.g. "this is a method that returns a Result, IOException, ParseException, or SomeOtherThing". In functional land it seems like it'd be really easy to switch on the result.

Android: Dalvik


Android compiles your Java down to JVM bytecode .class files, and then the dexer compiles those .class files down to Dalvik bytecode (.dex). Android devices don't have JVMs; they have DVMs. Actually, they did until about Android 4.4; since then Dalvik has been replaced by ART (the Android Runtime).

Android Activity

Activity is UI + execution. It’s a component, I guess.

Activities invoke each other with Intents. Several activities might be registered for a given Intent. An application is a bundle of activities. Activities don’t directly call code on other activities; rather, intents are used. Don’t hold on to references to Activities; they’re meant to be aggressively GC’d.

A Task is a chain of user interactions that might span multiple activities (sometimes apps), e.g. going to Messaging, looking up a Contact, and calling that contact (3 separate activities from 3 different apps).

A Service is a background task, e.g. a music player, or any kind of server waiting for a client interaction. Android avoids reclaiming services and keeps them alive unless it's under extreme memory pressure.

Android multi-user

Android runs on Linux, and each installed application gets its own user and group, and that app runs under them. So basically applications can't access other applications' data, unless they explicitly share a user id, which is only allowed for apps signed with the same keystore (i.e. the same vendor).



"foo = %{foo}" % { :foo => 'bar' }        #=> "foo = bar"

Base64 from the shell

echo "wat" | openssl enc -base64 -A

From http://www.w3.org/TR/SRI/#goals

echo -n "alert('Hello, world.');" | openssl dgst -sha256 -binary | openssl enc -base64 -A

SubscribeOn vs ObserveOn



Nils nils nils

Yes, use .fetch() over [] if you’re working with an options hash where all fields are expected/required so that you don’t accidentally leak nils. But .fetch(:wat, nil) is the most pointless thing of all time. It adds no value over [:wat]. So stop thinking about it, Matchnozzle!

JSON Pointer

RFC: https://tools.ietf.org/html/rfc6901

TL;DR defines how to reference values within a JSON doc in a standardized way, including via normal URIs. So potentially you could cite a value in an API request from Wikipedia I guess.

Also used in JSON Patch to describe the path to changed things.
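
Quick sketch of the pointer syntax (the doc here is made up):

var doc = { articles: [{ title: "hi", tags: ["a", "b"] }] };

// ""                   -> the whole document
// "/articles"          -> the array of articles
// "/articles/0/tags/1" -> "b"
// "~1" escapes "/" and "~0" escapes "~" inside key names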

Persistent Data Structure


Preserves previous versions of itself when modified; often intertwined with "immutable" since in languages/libraries like Clojure the data structures internally share structure between versions, so "copies" are cheap.
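
Tiny sketch of the structural-sharing idea (a cons list, nothing Clojure-specific):

// "prepending" returns a new list that reuses every cell of the old one,
// so the old version stays intact and cheap to keep around
function cons(head, tail) { return { head: head, tail: tail }; }

var xs = cons(2, cons(3, null)); // (2 3)
var ys = cons(1, xs);            // (1 2 3), shares xs's cells
console.log(ys.tail === xs);     // => true; xs is untouched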


Lispy language that compiles down to bytecode run on a proprietary runtime.

Excellent article: http://programming-puzzler.blogspot.com/2010/08/racket-vs-clojure.html

In short, Clojure wins because of data structures (but is still annoyingly limited by Java ties; no tail recursion, poor performance w numbers sometimes, etc).

Bottom line

Literally the bottom line of a financial statement, where the total is calculated. That's what the saying refers to when folks ask "how does this affect our bottom line?"



wtf is a brooch.


Decorative jewelry that can be attached to garments, often to hold them closed. Could be collars to shirt, could be holding together a robe-ish thing. Brooch brooch brooch. Remember that shit.

Android HW Accel

There's hardware acceleration in the Android rendering pipeline since Honeycomb (Android 3.0, which came out Feb 2011). The manifest attr that controls it defaults to false when targeting API levels below 14 (it defaults to true from targetSdkVersion 14 on):

<application ... android:hardwareAccelerated="false">

So you’d have to set that to true to enable hw accel globally. Then there’s finer granularity for window and View via setFlags and setLayerType.

Amazon Local

What is it? Google autocomplete search elucidates:

“amazon local vs”

  • groupon
  • livingsocial
  • square

Seems to promote your business via groupon-esque deals.

Java Anonymous Classes

Weird syntax I didn't recognize. Basically lets you define and instantiate a one-off class in place, assuming you don't need to reuse it elsewhere. The cool thing about this is that you can implement/instantiate an instance of a class based entirely on an interface (this was weird for me since interfaces in Java seem like these ephemeral ghostly non-existent things that normally take a lot of verbose code to instantiate).


public class HelloWorldAnonymousClasses {

    interface HelloWorld {
        public void greet();
        public void greetSomeone(String someone);
    }

    public void sayHello() {
        // this is a function body!

        // this is a "local" class... which means you define it,
        // and...
        class EnglishGreeting implements HelloWorld {
            String name = "world";
            public void greet() {
                greetSomeone("world");
            }
            public void greetSomeone(String someone) {
                name = someone;
                System.out.println("Hello " + name);
            }
        }

        // ... THEN you instantiate it
        HelloWorld englishGreeting = new EnglishGreeting();

        // This is an anonymous class; define it at the same time
        // you instantiate it.
        HelloWorld frenchGreeting = new HelloWorld() {
            String name = "tout le monde";
            public void greet() {
                greetSomeone("tout le monde");
            }
            public void greetSomeone(String someone) {
                name = someone;
                System.out.println("Salut " + name);
            }
        };

        englishGreeting.greet();
        frenchGreeting.greet();
    }
}
Wadsworth Constant


The Wadsworth Constant is the fundamental idea that the true meaning of a video, conversation, or comment approaches importance after approximately 30% of it has been skipped over.



A feature of SSH that multiplexes ssh sessions over one TCP connection, cuts down TCP connection overhead, etc. Ansible benefits from it.

Host-based authentication

hence pg_hba.conf.

binstubs and gem

here’s a binstub

require 'rubygems'
gem 'bundler'
load Gem.bin_path('bundler', 'bundle')

binstubs prep $LOAD_PATH ($: is the same thing) before a gem is run.

The gem 'bundler' command prepends LOAD_PATH with the specified gem’s load paths. Additional calls to gem append after the load paths of earlier gems.


scans your logfiles, looks for malicious activity, bans bad guys for a while.

Digital Ocean droplets have public IPs, ec2s are priv

ec2 does expose a public URL/IP, but traffic to it has to go through a firewall defined by your security group. Note that this is obvious because, within an ec2 instance, its IP is in one of the private IP address ranges (10.0.0.0/8, 192.168.0.0/16, and 172.16.0.0/12). So can you use a private IP publicly?


You could, but any network admin/ISP is going to block egress and ingress packets sourced/destined for any of these ranges. This prevents IP spoofing among other things, e.g. you can’t forge a fake packet and expect it to be routed to some internal server, which might read and respond to the packet and cause damage… all of this is avoided by internal servers using private IPs.

Ansible: modules vs playbooks vs rules

  • playbook
    • list of “plays”
  • play
    • maps a group of hosts to some well-defined roles
  • roles
    • reusable bundles of tasks (plus handlers, templates, files, vars)
  • tasks
    • a call to an ansible module (which can also be done directly, e.g. ansible somehost -m ping)
    • tasks are performed one at a time (though they probably branch out simultaneously on multiple hosts)
  • module
    • can be executed directly via ansible or via playbooks
    • a specific command; can make changes to a variety of server types
    • e.g.
      • ansible webservers -m service -a "name=httpd state=started"
      • ansible webservers -m ping
      • ansible webservers -m command -a "/sbin/reboot -t now"

Ansible tags


Use tags to run subsets of a playbook.

  • tag certain tasks
  • specify at run time a tag of which tasks to run
  • you can also tag includes
    • such that if you specify tags to ansible-playbook, it won’t include those rules unless tagged

Sudo password is yours, not root's!!!!


When you execute sudo command, the system prompts you for your current user account's password before running the command as the root user. By default, Ubuntu remembers the password for fifteen minutes and won't ask for it again until the fifteen minutes are up.

This is why you have a sudoers file! It defines who's allowed to run commands as root via sudo.

su -c 'some command' on the other hand asks for root’s password.

Debian Hosts file


127.0.0.1       localhost
127.0.1.1       host_name

The IP address 127.0.1.1 in the second line of this example may not be found on some other Unix-like systems. The Debian Installer creates this entry for a system without a permanent IP address, as a workaround for some software (e.g., GNOME), as documented in bug #719621.

The host_name matches the hostname defined in /etc/hostname.

For a system with a permanent IP address, that permanent IP address should be used here instead of 127.0.1.1.

For a system with a permanent IP address and a fully qualified domain name (FQDN) provided by the Domain Name System (DNS), that canonical host_name.domain_name should be used instead of just host_name.


e.g. localhost; it’s a way to access a computer’s own network services via a network interface. I guess I knew this, but I just didn’t think about how it unifies the interface… e.g. whether it’s local or remote, just use IP all the time.


“interface config”

debops: run site.yml first

Don’t skip this step!

I used rails_deploy before site.yml and things were just ambiguously missing and I had to patch them up.


/etc/shadow

It's where all the encrypted (hashed) passwords live. /etc/passwd has all the user names, but lots of applications need to read it, so you don't want password hashes sitting in it. So in passwd you put 'x' for the password, and that causes /etc/shadow to be looked up.

Ember Streams


My goal is to tease out the differences between Ember Streams and Rx.


A stream has subscribers.


Wraps a single subscription. Broadcasts events from this single subscription to all subscribers.

Rx Disposable

Anything with a .dispose method. So, what’s disposable in Rx?

  • subscriptions
  • observers

What’s it used for?

  • cleaning up some resource after subscriptions no longer need it
  • generic disposal of no-longer needed objects
    • e.g. internal (non-detach) observers are disposable, but not really for the purpose of cleaning up some resource, but just making sure that no more events make it through

There isn't a Subscription object in Rx, but you can think of the return value of observable.subscribe as a "subscription". Technically it's an "auto-detach" observer; an auto-detach observer proxies to an underlying observer. When an auto-detach observer is disposed of, it disposes of the underlying subscribed observables… except that an observable isn't a disposable; observables return disposables from their subscribe methods.

Another way of thinking about this is that, on their own, observables don’t act/exist/do anything until they’re brought to life by a subscription. There’s nothing to “dispose” of until there is a subscription. Once someone subscribes, then you have a bunch of stuff that needs to be cleaned up at some point, hence disposables.

Disposables have their use beyond Rx; dispose is kind of like a destructor; JavaScript doesn’t have destructors. Destructors don’t make as much sense in garbage-collected languages because their timing is non-deterministic (often it ends up being more practical to manually manage this yourself).
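
Sketch (RxJS 4-style, like the other snippets in here):

var subscription = Rx.Observable.interval(1000)
  .subscribe((v) => console.log(v));

// the returned "subscription" is a disposable (really an auto-detach observer);
// disposing it stops events and tears down the interval timer
subscription.dispose();

// anything with .dispose() counts; you can roll your own:
var d = Rx.Disposable.create(() => console.log('cleaned up'));
d.dispose(); // => 'cleaned up'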

Ember destroy

Ember’s got “destroy”-ables… destroy methods on objects, and isDestroyed properties that get set and checked, assertions if things are called on destroyed objects, etc.
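
Minimal sketch (the timer here is made up, just to have something to clean up):

var SomeThing = Ember.Object.extend({
  init() {
    this._super(...arguments);
    this.timer = setInterval(() => this.notifyPropertyChange('now'), 1000);
  },

  willDestroy() {
    // subclass hook: clean up anything destroy() doesn't know about
    clearInterval(this.timer);
    this._super(...arguments);
  }
});

var thing = SomeThing.create();
thing.destroy();
thing.get('isDestroying'); // => true right away
// isDestroyed flips to true once the scheduled teardown actually runs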

Ember Object destroy

  • schedules call to willDestroy hook, meant for subclasses to implement
  • aggressively tears down meta object
    • destroys bindings/observers
      • hence anyone listening to events on the obj or binding to values won’t get any more updates.
    • enables eager GC of metadata (it’s pretty easy in JS/any dynamic language to keep around references to stuff you don’t care about any more; at least with destroy you can eagerly remove things)

EventDispatcher destroy

Removes all dispatcher-added jQuery event listeners. Calls super.

Collection View

Calls super, removes array observers (which live on the content array and delegate to the collection view), destroy empty view.

Core View

First off, CoreView is “deprecated” in that it shouldn’t be used directly, but Ember.View still extends it.

Calls super, destroys the DOM el (?), some other crap.

Probably doesn’t make sense to write about until Glimmer.

Rx Observers

Class hierarchy

  • Observer
    • AbstractObserver
      • AnonymousObserver

You can fulfill the Observer contract without using an Rx Observer class:



var onNext      = () => { console.log('next'); };
var onError     = () => { console.log('error'); };
var onCompleted = () => { console.log('completed'); };

var pojo = { onNext, onError, onCompleted };

var legitObserver = Rx.Observer.create(onNext, onError, onCompleted);
// same thing:
// var legitObserver = new Rx.AnonymousObserver(onNext, onError, onCompleted);

Rx.Observable.range(1,5).subscribe(legitObserver);
Rx.Observable.range(1,5).subscribe(legitObserver);

Rx.Observable.range(1,5).subscribe(pojo);
Rx.Observable.range(1,5).subscribe(pojo);



The difference is that legitObserver's hooks only fire for the first subscription, and nothing fires for the second; the pojo, on the other hand, runs through the range twice. Why? Because Rx.Observer.create creates an AnonymousObserver, which is an AbstractObserver, and AbstractObserver sets an isStopped=true flag to prevent further events from coming through. In other words, generally speaking an observer is only meant to be attached to one subscription, but if you want the same object to receive events from multiple observables (and for some stupid reason you don't want to call Observable.prototype.merge), then you can just pass subscribe a pojo with the necessary onNext, onError, onCompleted methods defined.

Of course you can subscribe the same handler fns to two separate observables, and internally two separate AnonymousObservers will get created, so events from both subscriptions will fire:

Rx.Observable.range(1,5).subscribe(onNext, onError, onCompleted);
Rx.Observable.range(1,5).subscribe(onNext, onError, onCompleted);

JS _super pattern

Seems obvious, but a decently nice pattern for calling a superclass’s method, taken from Ember’s streams:

merge(SomeSubclass.prototype, {
  /* ... */

  _super$destroy: SomeSuperclass.prototype.destroy,

  destroy() {
    // do subclass-specific stuff

    this._super$destroy();
  }
});
Routeable components / attrs / query params

export default Ember.Route.extend({
  queryParams: {
    page: {
      default: 1,
      // refresh: true,
      // API: infer/generate action name based on param?
    }
  },

  model(params) {
    // params has all params, including QPs
  },

  attrs() {
    return {
      model: this.model(),
      updatePage: this.actions.updatePage
    };
  },

  actions: {
    updatePage(newPage) {
      // this is a manual implementation of the inferred action
      // based on the page QP

      // default implementation:
      // when refresh: false (default)
      this.component.set('page', newPage);

      //this.component.set('page', newPage);
    }
  }
});

// articles template

// articles component
export default Ember.Component.extend({
  // implicit attrs

  //attrTypes: {
  //  page: number,
  //},

  page: null,

  // this is called on initial render and on prop updates
  // (willReceiveAttrs is called on re-render only)
  willRender(attrs) {
    this.set('page', attrs.page);
  }
});

Lessons Learned:

  • Route passes in read-only (non-mut) attrs to routeable component
  • This prevents type-writer query params

    // my-input component
    export default Ember.Component.extend({
      actions: {
        doSomethingThatChangesValue() {
          // TODO: mut api for changing thing
        }
      }
    });


bashrc and bash_profile

ALWAYS forget the difference between these things. One of them is for login shells, one is both. Blurg.


  • .bashrc is read for interactive, non-login shells
  • .bash_profile is read for login shells
  • Mac OS X uses login shells for its terminals, iTerm included
  • hence, might make sense to just source .bashrc from .bash_profile

Also, the meaning of rc isn’t totally known; it could be “run commands” or “runtime configuration” but no one really agrees.

Glimmer Streams

KeyStream takes a source obj stream and a path and streams property values based on the provided key. Can be generated via sourceStream.get(‘wat’). A source stream is just a stream of objects. KeyStreams stream property changes on objects. Changing the underlying object of the source stream will fire a change event on the KeyStream. KeyStreams watch for changes using Ember Observers (addObserver/removeObserver)

KeyStreams (among others) have a setSource that changes the underlying stream of objects; calling setSource will always cause a notify().

Why? mmun says:

in order to not notify we would have to eagerly compute the stream value and compare to the last value

So it’s a tradeoff between minimizing notify spam and losing value laziness.

TODO: how do views use baseContext?

ContextStream is a

Bash Completion


  • works w Env vars, e.g. $BASH_[tab]
  • use complete to specify rules for a command, e.g. match filenames, filter by this regex, etc
  • use compgen to pass shit to a filter fn written in bash (prefixed by underscore by convention)

why nom/bom/nombom

Most ember devs have to do something like this due to NPM fidgetries:

alias nombom='npm cache clear && bower cache clean && rm -rf node_modules bower_components && npm install && bower install'
alias nom='npm cache clear && rm -rf node_modules && npm install'
alias bom='bower cache clean && rm -rf bower_components && bower install'

Why is this necessary? Because:

  1. Even though npm will take your project's dependency versions into consideration when choosing the versions of dependencies-of-dependencies, once a package has been installed in node_modules, that old version stays cached in deep nestings of node_modules even if you bump your project's dependencies, hence it's safest to nuke node_modules. Afaik this only explains the rm -rf node_modules side of things, though.
  2. NPM won’t install a newer version of a dependency if a matching one exists in the NPM cache (~/.npm/...). Wait, isn’t this desirable? Shouldn’t this be a cue to bump your dependency version?

This seems to be the chief reason why nombom is the only path to sane dependency installation:

  • NPM caches modules by their version, e.g. fstream@1.0.4 (these get stored in ~/.npm/fstream/1.0.4/package.tgz)

Bash + vim (or whatever)

  1. set -o vi so that bash is in Vim mode
  2. Type out a command
  3. Press escape (to leave “insert mode”)
  4. press v to open vim w the current command
  5. Save and quit to execute the command



Generating a (potentially infinite) vector from a scalar. Also known as an unfold.

Actually this is better:


  • Ana(morphism): T -> IObservable<T>
  • Cata(morphism): IObservable<T> -> T
  • Bind: IObservable<T> -> IObservable<T>

Hot observables, connect vs refCount vs singleInstance


I haven’t done singleInstance yet, but basically it’s like publish().refCount() except it’ll resubscribe if refcount goes to 0 and then back to 1.
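
Quick sketch of publish().refCount() (RxJS 4-style):

var shared = Rx.Observable.interval(500).publish().refCount();

var a = shared.subscribe((v) => console.log('a', v)); // refcount 0 -> 1: connects to the source
var b = shared.subscribe((v) => console.log('b', v)); // shares the same connection

a.dispose();
b.dispose(); // refcount back to 0: disconnects from the interval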

The ultimate nom nuance

npm will assemble/download all project dependencies with the following rule:

  • a dependency gets installed only at the rootmost node_modules that requires it, e.g.


  - foo
  - bar
    - baz
      - foo

npm will NOT install a baz/node_modules/foo but will rather install a single foo at the root node_modules; the reason this works in Node land is that require starts at the current directory and traverses upward through node_modules folders.

Shitty example: start with

  - bar
    - baz
      - foo

then install at top level foo

  - foo
  - bar
    - baz
      - foo

npm will not remove foo from baz… you wind up with two different versions of foo, shadowing each other. Holy shit!!!!!

NPM dedupe would remove the second foo. NPM 3 will probably call dedupe automatically when you do npm install. But dedupe has its own issues, not the least of which is that no one knows about it and it’s one more thing to tell your team about.


npm delicately stitches things together such that Node’s folder-bubbling require resolution semantics can find the packages installed by npm. It doesn’t override $LOAD_PATH like Bundler does or anything like that.

NPM cherry-pick

git cherry-pick -x f2c270e8d76e81a1921bbc31777aa3ac570ca87a

This is how I pulled in a change on an already-merged idempotent-rerender.


OnErrorResumeNext

What a horrible name!

Basically, it’s concat that automatically recovers from errors.



Just the title of this section will send a shudder down the spines of old VB developers! In Rx, there is an extension method called OnErrorResumeNext that has similar semantics to the VB keywords/statement that share the same name.

Just as the OnErrorResumeNext keyword warranted mindful use in VB, so should it be used with caution in Rx. It will swallow exceptions quietly and can leave your program in an unknown state.

Observable#finally invoked even on dispose


ConnectableObservables can reconnect


var c = Rx.Observable.interval(500).publish();

c.subscribe((v) => {
  console.log(v);
});

// returns a disposable that disconnects.
var s = c.connect();

setTimeout(() => {
  s.dispose();     // disconnect...
  s = c.connect(); // ...then connect again; the interval starts pumping again
}, 1200);


Didn’t understand this til I understood publishLast() and replay(), which basically apply multicast functionality through async subject and replay subject, respectively. Basically replay and async subjects just define different forms of caching, and if you want to share that functionality with connectable observable then you wanna use multicast.

.Publish() = .Multicast(new Subject<T>)
.PublishLast() = .Multicast(new AsyncSubject<T>)
.Replay() = .Multicast(new ReplaySubject<T>)
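
Roughly the same equivalences in the RxJS 4 flavor used elsewhere in these notes (just a sketch):

var source = Rx.Observable.interval(500);

var viaPublish     = source.publish();
var viaMulticast   = source.multicast(new Rx.Subject());       // same idea as publish()

var viaPublishLast = source.publishLast();
var viaMulticast2  = source.multicast(new Rx.AsyncSubject());  // same idea as publishLast()

var viaReplay      = source.replay();
var viaMulticast3  = source.multicast(new Rx.ReplaySubject()); // same idea as replay()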


Rx window

I've talked about this before, grokking it better now:


var o = Rx.Observable.range(5, 20);

// HOT 
var shared = o.publish().refCount();

var openings = shared;//.filter((_, idx) => idx % 3 === 0);

shared.window(openings, () => 
  shared.filter((v, idx) => v % 5 === 0)
).selectMany((obs) => {
  return obs.toArray();
}).subscribe((arr) => {
  console.log(arr);
});


  • closing selector gets passed the value emitted by windowOpenings that caused it to be opened in the first place.
  • if the closing selector is based on the same source data stream, you almost certainly want to use publish().refCount(), or at least make sure the source stream is hot (or, if it's cold, that it doesn't expensively re-create / duplicate some underlying resource)

Subscribing inside Observable.create()

The contract for the subscribe function you pass to Observable.create is to

  • fire onNext*(onComplete|OnError)? on the observer passed in
  • return a disposable

Based on that, the following is a perfectly valid way to alias an observable:


var proxy = Rx.Observable.create((o) => {
  return Rx.Observable.interval(200).subscribe(o);
});

proxy.subscribe((v) => {
  console.log(v);
});

proxy behaves just like interval.

C# += and -= event subscription syntax

this.Click += (s, e) => { MessageBox.Show(/* ... */); };

Either<LeftT, RightT>

In Rxx, Either is used a bunch to imply a broadcast of data from either source sequence A or source sequence B; one nice use case of it is how their Retry method works.


Basically, vanilla Rx retry disposes of the error that causes the retry to happen, which means it's tricky to do logging in a nice composable way.

Apple IP is 17.***.***.***



The entire 17.0.0.0/8 address block is assigned to Apple

So you can check your logs to see if Apple’s snooping your shit.


It is present in ALL .ipas generated by Xcode, including App Store builds, but by the time you download from the App Store, it has been stripped out.

RxJava lifts, other things


  • hot sequences have no subscription side effects
  • cold sequences may have subscription side effects

  • Observable.just(“1”)

    • cold, because generates that string every subscription
  • Observable.interval()
    • cold, no duh
  • ReplaySubject
    • cold???
    • hot if you’re first subscriber
    • cold if there are items to replay, and then hot thereafter

Redis: why use hash?

Why do

HSET somehash key val

when you could just SET somehash:key val?

One answer: a hash groups related fields under one key (HGETALL them, expire/delete them together), and Redis stores small hashes in a memory-efficient encoding, so lots of small hashes are cheaper than lots of top-level keys.

redistogo disables CONFIG command

1:39 PM <machty> any idea why on redis 2.8.11 i'm getting ERR unknown command 'config' ? docs say config command exists since 2.0.0
1:48 PM <machty> actually i think it's because CONFIG is disabled by redistogo service... unfortunately both CONFIG GET and CONFIG SET :/
1:55 PM <xxxx> machty: I actually work for Redistogo
1:55 PM <xxxx> We do disable it indeed
1:56 PM <machty> xxxx: ah, i guess there's no way to dynamically get maxmemory? was thinking of using it for alerts
1:56 PM <xxxx> Unfortunately not, there was talk of working on API for it but that development kind of got halted

Apparently a lot of effort is being shifted from redistogo to ObjectRocket redis, which:

  • doesn’t disable CONFIG
  • has high availability (HA) enabled by default via sentinels
  • http://redis.io/topics/sentinel


What is it?



  • nagios: infrastructure monitoring
  • log = timestamp + data
  • Logstash
    • open source
    • config file is nginx-ish
    • output graphs and a bunch of other things
    • use grok
      • write patterns, and give them name
      • reusable regex
      • comes w 100 patterns
      • no need for regex skills
    • date filter included to handle all varieties of timestamp format
    • stop inventing time formats
    • multiline filter for errors w stack trace
    • gettimeofday
    • NTP
      • network time protocol, used for syncing servers
      • if apache uses gettimeofday
    • feature
      • transport and process logs to and from anywhere
        • get them in analyzable format
      • provide search and analytics
    • community
      • kibana: web interface for logstash
      • logstash-cli: search / analytics from commandline
    • dreamhost deployment
      • 20k apache events/sec peak
      • 250 mil events/day
      • 75gb day
      • 160 web servers
      • 7 logstash / elasticsearch servers



  • elastic search
    • searchable database
  • logstash
    • parses/processes logs from many sources, stores in centralized location
  • kibana
    • web UI for searching/visualizing what's in elasticsearch


  • installation
    • download, untar, and run

So who the fuck uses ELK stack? Just enormous companies that manage their own custom infrastructures?

Answer: anyone who does enough devops to

  • be smart enough to set it up
  • know the pain of living without it

Logstash is generally meant to live on the machine that's producing the logs, and then it can forward them on to an elasticsearch cluster.

Ruby logging

A flexible logging library for use in Ruby programs based on the design of Java’s log4j library.