Brain Emulation by 2030

Over the past few years I’ve been thinking about whole brain emulation (WBE) and the computational resources it requires. My conclusion is that the required technology level will be reached in the 2025 to 2030 time frame.

Although most estimates focus on calculations per second, the relevant parameters are:

  • Calculations per second
  • Memory size
  • Memory bandwidth per node
  • Inter-node communication bandwidth

Here I assume the WBE detail level to be somewhere between level 4 (spiking neural model) and level 5 (electrophysiological model) from the Whole Brain Emulation Roadmap. The computational capacity required would be around 100 exaFLOPS (1e20 FLOPS) and the memory capacity 10 petabytes (1e16 bytes). Plugging these into the CPU and memory timeline calculators gives expected arrival times of 2026 and 2020, respectively, for $1M.

Based on my models, it appears that memory bandwidth will be a significant bottleneck, while inter-CPU node bandwidth will not be.

I’ve created a spreadsheet where the two bandwidths are estimated.

The architecture I envision is 1 million compute nodes arranged in a 2-D cluster, 1000 on the side.  Each node will have 100 TFLOPS CPU capacity and 10 GB of memory.  Nodes are connected to the nearest 4 neighbors using a high speed bus.  Longer distances are covered by multiple hops.
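As a quick sanity check, the per-node figures can be multiplied out to confirm they match the totals quoted earlier. This is only a sketch of the arithmetic: the function names are made up for illustration, gigabytes are assumed to be decimal (1e9 bytes), and the hop count assumes packets travel along grid links only (Manhattan distance).

```python
# Back-of-envelope totals for the proposed cluster, using the figures in the post.
NODES_PER_SIDE = 1000        # 2-D grid, 1000 nodes on a side
FLOPS_PER_NODE = 100e12      # 100 TFLOPS per node
BYTES_PER_NODE = 10e9        # 10 GB per node (decimal gigabytes assumed)


def cluster_totals(side=NODES_PER_SIDE, flops=FLOPS_PER_NODE, mem=BYTES_PER_NODE):
    """Return (node count, total FLOPS, total bytes) for a side x side grid."""
    nodes = side * side
    return nodes, nodes * flops, nodes * mem


def hops(a, b):
    """Number of nearest-neighbor hops between grid coordinates a and b."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


nodes, total_flops, total_bytes = cluster_totals()
print(f"nodes:        {nodes:.0e}")
print(f"total FLOPS:  {total_flops:.0e}")   # 100 exaFLOPS
print(f"total memory: {total_bytes:.0e} bytes")   # 10 petabytes
print(f"corner-to-corner hops: {hops((0, 0), (999, 999))}")
```

The totals come out to 1e20 FLOPS and 1e16 bytes, which is where the 100 exaFLOPS and 10 petabyte figures come from.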

The inter-node bandwidth depends on the distance that voltage information has to travel, which is related to the axon length. Although some axons are very long, axon lengths probably follow a power law distribution, and long ones are rare. Based on a 1 kHz update frequency and ~60 nodes within average axon distance, the required communication bandwidth is 250 Gb/s, which is close to the capabilities of current technology.

The local node memory bandwidth depends on the amount of synapse state memory and the required refresh rate. Based on a 1 kHz update rate, this works out to 14 TB/s. This figure is about 3 orders of magnitude higher than current technology.
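The order of magnitude can be checked with a simple model: bandwidth is the state touched per update times the update rate. The sketch below is not the spreadsheet's formula (which is not reproduced here); the function name and the `passes_per_update` parameter are assumptions for illustration. One full sweep of a 10 GB node at 1 kHz gives 10 TB/s, so the 14 TB/s figure would correspond to touching about 1.4 times the node's memory per update under this model.

```python
def memory_bandwidth_bytes_per_s(state_bytes, update_hz, passes_per_update=1.0):
    """Bytes per second needed to sweep the state `passes_per_update` times per tick.

    Hypothetical model: every byte of synapse state is touched on every update.
    """
    return state_bytes * update_hz * passes_per_update


NODE_STATE_BYTES = 10e9     # 10 GB per node, from the post
UPDATE_HZ = 1000.0          # 1 kHz update rate, from the post
QUOTED_BYTES_PER_S = 14e12  # 14 TB/s, the post's estimate

full_sweep = memory_bandwidth_bytes_per_s(NODE_STATE_BYTES, UPDATE_HZ)
implied_passes = QUOTED_BYTES_PER_S / full_sweep

print(f"one sweep per update: {full_sweep / 1e12:.1f} TB/s")
print(f"passes implied by 14 TB/s: {implied_passes:.2f}")
```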

The high memory bandwidth required makes it likely that some kind of CPU/memory on-die integration will be needed. The memristor seems a good candidate.

It is interesting to think of the brain as a structure with a dimension somewhere between 2 and 3, due to its non-zero depth and its folds. The question has been asked whether this poses a barrier to implementation. The depth aspect is not an issue, since dividing the brain on a 2-D grid will put most vertical columns on the same node. Folds effectively reduce the length that an axon has to travel. It seems that in the worst case two points that “should” be a meter apart (brain diameter when laid flat) are only 10 cm apart (actual brain diameter). A 10-fold reduction in distance means an axon of a given length can reach about 10 × 10 = 100 times as many nodes on the flattened grid, so this may increase the inter-node bandwidth by a factor of 100, if most axons take advantage of the added dimensionality for routing. Since the inter-node bandwidth is relatively modest, it is doubtful that even a 100-fold increase will make it the bottleneck instead of the memory bandwidth.
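To make the comparison concrete, the worst-case link bandwidth can be converted to the same units as the memory bandwidth. This is a rough sketch using only the figures in the post (250 Gb/s per link, 14 TB/s memory bandwidth, 4 links per node); the variable names are made up, and decimal units are assumed.

```python
LINK_GBIT_PER_S = 250.0    # per-link requirement, from the post
MEM_TBYTE_PER_S = 14.0     # per-node memory bandwidth, from the post
LINKS_PER_NODE = 4


def gbit_to_tbyte(gbit_per_s):
    """Convert Gb/s to TB/s (8 bits per byte, 1000 GB per TB)."""
    return gbit_per_s / 8.0 / 1000.0


# Folding shrinks worst-case distance from ~1 m to ~0.1 m; reach on a 2-D
# grid scales with the square of the linear factor.
flat_diameter_m = 1.0
folded_diameter_m = 0.1
fold_factor = round((flat_diameter_m / folded_diameter_m) ** 2)   # 100

worst_link = gbit_to_tbyte(LINK_GBIT_PER_S * fold_factor)   # 3.125 TB/s
worst_node = worst_link * LINKS_PER_NODE                    # 12.5 TB/s

print(f"worst case per link: {worst_link} TB/s")
print(f"worst case per node: {worst_node} TB/s (memory: {MEM_TBYTE_PER_S} TB/s)")
```

Per link, the worst case stays well under the memory bandwidth. Summed over all four links it comes to 12.5 TB/s, which is still below 14 TB/s but by a much narrower margin.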

In sum, here are the performance requirements:

  • 1 million compute nodes in a 2-D topology
  • 100 TFLOPS per node
  • 10 GB memory per node
  • About 14 TB/s memory bandwidth per node
  • 4 links to neighboring nodes at 250 Gb/s

It seems likely that this kind of machine will be available by 2030 for $1M, and possibly as early as 2025.  It would be useful to evaluate the increase in memory bandwidth over time and extrapolate to this time frame.
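One simple way to do that extrapolation is to assume bandwidth grows exponentially and ask how long a given gap takes to close. The sketch below is illustrative only: the doubling times are hypothetical placeholders, not measured trends, and the function name is made up. The only figure taken from the post is the gap of about 3 orders of magnitude.

```python
import math


def years_to_close_gap(gap_factor, doubling_years):
    """Years for an exponentially growing quantity to grow by `gap_factor`."""
    return math.log2(gap_factor) * doubling_years


GAP = 1000.0   # ~3 orders of magnitude, from the post

# Hypothetical doubling times; real memory-bandwidth trends would need to be
# measured before drawing any conclusion from these numbers.
for doubling in (1.0, 1.5, 2.0):
    years = years_to_close_gap(GAP, doubling)
    print(f"doubling every {doubling} years -> about {years:.1f} years")
```

The result is very sensitive to the assumed doubling time, which is why measuring the actual historical trend matters.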

7 Responses to Brain Emulation by 2030

  1. cesium62 says:

    “Based on a 1KHz update rate, this works out to 14 TB/s. This figure is about 3 orders of magnitude higher than current technology.”

    This is incorrect. You are looking at a compute node with 100 TFLOPS and 10GB. Current computers are on the order of 30GFLOPS and 30GB of main memory, but 30MB of cache memory. Thus we would expect your 100TFLOPS compute node to have at least 30GB of cache memory. So you should compare to level 2 cache memory access times, not main memory access times.

    • miron says:

      That’s an excellent point. Current cache BW is 32GB/s per core. So if you have, for example, 512 cores, that’s 16TB/s, which is already good enough.

      So, 2025 seems likely.

  2. SarK0Y says:

    Exascale computing seems badly misleading. Let’s assume we have a cluster of N computing nodes, then scale the system to 2N nodes. Will we get a 2X speed-up??? Certainly not! Furthermore, the greater N is, the smaller the speed-up we get from going to 2N 😉 Why is it that bad? Because we must spend processing power to keep threads synchronized, and backup has to be turned on too. Needless to mention, debugging such multithreaded code needs more sophisticated techniques than anything available nowadays.

  3. miron says:

    I agree that communication bandwidth between processing nodes can lead to less than linear scaling.

    As far as I understand, the connectivity of the brain is mostly local. This means that for this application we might be okay with local links to neighboring nodes and still get good scaling. For longer range connectivity, you can route packets along local links, and it’s okay if it takes longer, because in the real brain it takes longer.

    Also, you probably don’t need any kind of global synchronization, since the brain is likely not globally synchronized to any significant precision. Maybe on the order of a millisecond, not under that.

    So I’m personally optimistic about scaling.

    • SarK0Y says:

      Miron, if the system isn’t globally synced, there is a high probability of all kinds of collisions: for instance, hallucinations, unplanned moves, and so on.

  4. SarK0Y says:

    Actually, we haven’t got a clue why the brain works so fast and so stably with such humble power consumption.

  5. SarK0Y says:

    In fact, I’m an optimist too and strongly believe that we need something like this (http://goo.gl/1HVqm) to provide a technological breakthrough 🙂
