Who am I?

Santiago Saavedra

Who is OpenShine?

  • Small consultancy firm
  • Based in Madrid, Spain
  • Remote friendly

1. Not affiliated to DisplayLink Inc.

2. GPUL is a non-profit Association in Spain with over 500 members for fostering free software.

3. Did a talk yesterday in the Privacy devroom

What's the problem?

Elasticsearch queries…

Elasticsearch queries…

{
  "aggs": {
    "by_month": {
      "date_histogram": {
	"field": "date",
	"interval": "month"
      },
      "aggs": {
	"headquarters": {
	  "terms": {
	    "field": "headquarters"
	  }
	},
	"aggs": {
	  "headcount": {
	    "cardinality": "userId"
	  },
	  "aggs": { /* ... */ }
	}
      }
    }
  }
}

What's the problem?

How much does an Elasticsearch query impact cluster performance?

  • We have triggers
  • The Kibana Profiler (not on 5.1!)
  • We can do load testing
  • What about user-generated queries?
  • Could we know before running it?

What is static analysis?

Answer questions about code without running it.

  • Terminates?
  • Max memory needed?
  • Output for a given input? (Restrictions)
  • Can this pointer be null?
  • Is this variable always initialized?
  • Is this dead code?
  • Does this expression have the correct type?

Static optimization

Not only answering questions, but also modifications.

E.g., -O3 in the GNU Compiler Collection.

  • Pipeline assets reuse
  • Constant extraction
  • Type inference
  • Query optimization
  • Code rearrangement
  • Cost analysis
  • Cost optimization

Program correctness

  • Curry-Howard-Lambek correspondence
  • Proofs are types are proofs.
  • Richer type systems
  • Better compile-time error reporting
  • Model checking

Static cost analysis

E.g.

E. Albert, P. Arenas, S. Genaim, G. Puebla, and D. Zanardini. 2007. Cost analysis of java bytecode. In Proceedings of the 16th European Symposium on Programming (ESOP'07), Rocco De Nicola (Ed.). Springer-Verlag, Berlin, Heidelberg, 157-172.

Cost analysis?

GCC -march -mtune

-march

GCC uses name to determine what kind of instructions it can emit when generating assembly code.

Restricts the set of instructions.

-mtune

Specify the name of the target processor for which GCC should tune the performance of the code.

Choose the best instructions for a given set of operations.

CPU Cost analysis

Sorry, your browser does not support SVG.

Targeting a single platform

Sorry, your browser does not support SVG.

Targeting two platforms, preferring one

Sorry, your browser does not support SVG.

How to analyze cost

ElasticSearch queries are not exactly compiled. Lucene AST Query.

  • Elastic does not provide documentation or easy API access to internals before execution.
  1. Parse program text
  2. Optimize AST (if required)
  3. Traverse AST
  4. (Rearrange/optimize/error)

Our cost analysis is generated from the parse tree directly.

Is it efficient?

It's a recursive tree. Costs are calculated from children in a single pass, which leads to \(O(n)\).

More sophisticated analysis possible.

What does this depend on?

The cost grammar.

Make nodes only depend on themselves and their children.

Our grammar:

GET _searchv?size=0
{"aggs": { /* root aggregation */
  "sessions": { /* sub-aggregation */
    "terms": { "field": "sessionId", "size": 10 },
    "aggs": {
      "byDay": {/* Leaf aggregation */
	"date_histogram": { "field": "@date", "interval": "day" }
      }
    }
  }
}}

Improvements

  • Better model: include synthetic nodes
    • Caching
    • Network transmission of shard data
    • Coordination
    • Reduction steps

Not available. Elastic does not expose enough API surface.

Search engine would need to expose "compiled ASTs" a state where steps can be traced to the execution model effectively. À-la Spark's Catalyst Optimizer.

Talk is cheap. Show me the code. – Linus Torvalds

Technology used

  • Scala 2.12
  • json4s-native (to avoid conflicts with Jackson versions)
  • typesafe-config
  • pureconfig
  • Akka, optionally

Why Scala

  • More expressive Type System than Java.
  • Faster development in functional programming
  • Pureconfig for configuration (relies on typesafe-config but is actually typesafe)

AST Nodes

ast-nodes.png

Digression

Elastic does not even provide basic access to parsed information

/**
  * Hackity hack. For some reason, the Elastic team has not provided the
  * only useful method in all the AST: recursion.
  */
def getSubAggregations(ag: AggregationBuilder)
    : Seq[AggregationBuilder] = {
  import implicits._
  val factoriesBuilder = ag.getPrivateFieldValue[
      AggregatorFactories.Builder]("factoriesBuilder")

  factoriesBuilder.getAggregatorFactories.asScala
}

Digression

We reimplement our visibility needs. With dragons.

implicit class PrivateMethodAccessor[A](val instance: A)(
  implicit c: ClassTag[A]) {
  def getPrivateFieldValue[B](fieldName: String): B = {
    AccessController.doPrivileged(new PrivilegedAction[B] {
      override def run(): B = {
	val f: Field = c.runtimeClass.getDeclaredField(fieldName)
	f.setAccessible(true)
	f.get(instance).asInstanceOf[B]
      }
  })
}

Plugin security policy

When used as ES Plugin, quite hard security conditions:

grant {
  permission java.lang.RuntimePermission "accessDeclaredMembers";
  permission java.lang.reflect.ReflectPermission "suppressAccessChecks";
};

Child node parsing

case agg: DateHistogramAggregationBuilder =>
  val dateExpr = agg.dateHistogramInterval.toString
  1000 / dateMathExpressionToSeconds(dateExpr)
case agg: DateRangeAggregationBuilder =>
  import scala.collection.JavaConverters._
  agg.ranges().asScala.map(range =>
      analyzeRangeDates("to", range) - analyzeRangeDates("from", range))
    .sum
case _ => configForNode.defaultNodeCost
val childrenCost = tree.children.map(_.cost)
  .foldLeft(1d)(_ + _)
val totalNodeCost = nodeCost * childrenCost

Configuration

Configuration

All deployments can be configured in the same way: config.yaml

# config.yaml
default.maxTreeHeight: 3
default.whitelistChildren: ["terms", "date_histogram"]
custom.terms.defaultNodeCost: 10
custom.date_histogram.whitelistChildren: ["terms"]

Deployment details

Deployment mechanisms

ESCOVA is a library.

Sub-projects to deploy as:

  • Elasticsearch plugin (hosted in the ES cluster)
  • Web frontend to Elasticsearch
  • Stand-alone web server without ES fallback

As a plugin…

  • Created SBT structure for building ES plugins
  • Created from the Gradle code by ES
  • Cannot guarantee stability if Elastic doesn't
  • Sample plugin structure, take us as a working example

As a web frontend

  • Intercepts all requests to /_searchv
  • Other requests go to $ELASTICSEARCH_BACKEND, if available
  • Built with Akka-http
  • Sample project, adapt to your own needs

Deploy on Kubernetes

Chart available in the repo1. Configurable at values.yaml.

service:
  name: escova
  type: ClusterIP
  externalPort: 9200
backend: # Elasticsearch backend
  enabled: false # Redirect non-escova calls to backend ES
  host: my-internal-elasticsearch
  port: 9200
# Example configuration limiting the tree size
config: |
  default.defaultNodeCost: 1
  default.maxTreeSize: 4

Next steps

  • We need a logo!
  • Future-proof for ES 7+
  • Analyze statistics about dynamic usage e.g., cross data against Profiler
  • Take measurements from data
  • Improve static analysis

Q & A / Thank you!