With the help of specialized neuromorphic hardware, spiking neural networks
(SNNs) are expected to realize artificial intelligence (AI) with far lower
energy consumption. Combining SNNs with deep reinforcement learning (DRL)
therefore offers a promising, energy-efficient route to realistic control
tasks. In this paper, we focus on tasks in which the agent must learn
multi-dimensional deterministic control policies, a setting that is very
common in real-world scenarios. Recently, the surrogate gradient method has
been used to train multi-layer SNNs, allowing them to match the performance of
the corresponding deep networks on this task. Most existing spike-based RL
methods take the firing rate as the output of the SNN and convert it into a
continuous action space (i.e., the deterministic policy) through a
fully-connected (FC) layer. However, because firing rates are fractional
rather than binary, the FC layer requires floating-point matrix operations,
which prevents the whole SNN from being deployed directly on neuromorphic
hardware. To build a fully spiking actor network free of floating-point matrix
operations, we draw inspiration from the non-spiking interneurons found in
insects and use the membrane voltage of non-spiking neurons to represent the
action. Before the non-spiking neurons, multiple neuron populations are
introduced to decode the different action dimensions. Since each population
decodes one dimension of the action, we argue that the neurons within a
population should be connected in both the temporal and spatial domains.
Hence, intra-layer connections are added to the output populations to enhance
their representation capacity. Finally, we propose a fully spiking actor
network with intra-layer connections (ILC-SAN).

This article explores the potential of spiking neural networks (SNNs) combined with deep reinforcement learning (DRL) to achieve artificial intelligence (AI) with reduced energy consumption. The focus is on the task of learning multi-dimensional deterministic policies for control, which is common in real-world scenarios. The use of the surrogate gradient method allows SNNs to achieve comparable performance to deep networks. However, existing spike-based RL methods face challenges in converting firing rates to continuous action spaces due to floating-point matrix operations. To address this, the authors draw inspiration from non-spiking interneurons in insects and propose a fully spiking actor network with intra-layer connections (ILC-SAN) that eliminates the need for floating-point matrix operations. This innovative approach has the potential to enable SNNs to be directly deployed on neuromorphic hardware.

Exploring Innovative Solutions for AI with Neuromorphic Hardware

In recent years, there has been a growing interest in utilizing spiking neural networks (SNNs) for artificial intelligence (AI) tasks, thanks to their potential for energy efficiency. By combining SNNs with deep reinforcement learning (DRL), researchers have been able to tackle complex control tasks effectively. In this article, we propose a novel approach to training SNNs for multi-dimensional deterministic policies, with the goal of achieving even greater energy efficiency.

The Challenge of Spike-Based RL Methods

Traditional spike-based RL methods use the firing rate of neurons as the output of the SNN. This firing rate is then converted into a continuous action space representation (i.e., a deterministic policy) by a fully connected (FC) layer. However, because firing rates are fractional values rather than binary spikes, the FC layer must perform floating-point matrix operations.
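To make the problem concrete, here is a minimal sketch (illustrative only, not the paper's implementation) of this conventional rate-based decoder: binary spike trains are averaged into fractional firing rates, which an FC layer then maps to continuous actions through floating-point multiply-adds.

```python
# Illustrative sketch of the conventional rate-based decoding pipeline.
# Spike trains are binary (0/1); averaging them over time yields
# fractional firing rates, so the FC readout below is a genuine
# floating-point matrix-vector product.

def firing_rates(spike_trains):
    """Average binary spikes over time -> fractional rates in [0, 1]."""
    T = len(spike_trains[0])
    return [sum(train) / T for train in spike_trains]

def fc_decode(rates, weights, biases):
    """Fully-connected readout: floating-point multiply-adds."""
    return [
        sum(w * r for w, r in zip(row, rates)) + b
        for row, b in zip(weights, biases)
    ]

# Three neurons observed for 4 timesteps, decoded into a 2-D action.
spikes = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1]]
rates = firing_rates(spikes)          # fractional values, not 0/1
action = fc_decode(rates,
                   weights=[[0.2, -0.4, 0.1], [0.5, 0.3, -0.2]],
                   biases=[0.0, 0.1])
```

The floating-point products `w * r` in `fc_decode` are exactly the operations that neuromorphic chips, built around event-driven spike accumulation, do not natively support.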

This reliance on floating-point computation is a problem for deploying SNNs directly on neuromorphic hardware. Such hardware is designed to mimic the event-driven, parallel operation of biological brains, emphasizing energy efficiency, and typically supports spike-driven accumulation rather than dense floating-point matrix arithmetic. To fully leverage it, we need an actor network that is spiking end to end and does not depend on floating-point matrix operations.

Inspired by Non-Spiking Interneurons

To tackle this challenge, we draw inspiration from the non-spiking interneurons found in insects. Rather than emitting discrete spikes, these interneurons signal through graded changes in their membrane voltage. Analogously, we read the continuous action directly from the membrane voltage of non-spiking output neurons, so no floating-point conversion of firing rates is required.
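The idea can be sketched as follows (the dynamics and parameters here are our own simplifying assumptions, not the paper's exact model): a non-spiking output neuron integrates weighted input spikes into its membrane voltage, never fires, and its final voltage is read out as the action. Because inputs are binary, each update is a conditional accumulation of synaptic weights rather than a matrix product over fractional rates.

```python
# Sketch of a non-spiking readout neuron (assumed leaky-integrator
# dynamics). It accumulates synaptic weights whenever the corresponding
# input neuron spikes, and the final membrane voltage is the action.

def non_spiking_readout(spike_trains, weights, decay=1.0):
    """Leaky integration without a firing threshold; returns final voltage."""
    v = 0.0
    for t in range(len(spike_trains[0])):
        v = decay * v + sum(
            w for w, train in zip(weights, spike_trains) if train[t] == 1
        )  # accumulate weights only where a spike arrived
    return v  # membrane voltage serves directly as the action value

spikes = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1]]
action = non_spiking_readout(spikes, weights=[0.2, -0.4, 0.1])
```

Note that the inner sum only ever adds weights gated by binary spikes, which matches the accumulate-on-spike operation that neuromorphic hardware provides natively.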

In our proposed approach, we introduce multiple neuron populations before the non-spiking neurons. Each population, rather than a single neuron, is responsible for decoding one dimension of the action. By connecting the neurons within each population in both the time domain and the space domain, we can enhance the network's representation capacity.

Introducing Intra-Layer Connections

To further optimize the representation capacity, we utilize intra-layer connections (ILCs) in the output populations. These connections between neurons within the same layer help improve the network’s ability to represent multi-dimensional actions effectively.
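One way to picture these intra-layer connections (the LIF dynamics and parameters below are illustrative assumptions, not the paper's exact formulation): each neuron in an output population receives, in addition to its feed-forward input, lateral input weighted by the population's own spikes from the previous timestep, coupling the neurons across both space and time.

```python
# Sketch of one timestep of a LIF population with lateral (intra-layer)
# connections: prev_spikes are the population's own spikes from t-1,
# re-injected through the lateral weight matrix.

def lif_population_step(v, ff_input, prev_spikes, lateral_w,
                        decay=0.9, threshold=1.0):
    """Advance a LIF population one step; returns (voltages, spikes)."""
    new_v, spikes = [], []
    for i in range(len(v)):
        lateral = sum(lateral_w[i][j] * s for j, s in enumerate(prev_spikes))
        vi = decay * v[i] + ff_input[i] + lateral
        if vi >= threshold:          # fire and reset
            spikes.append(1)
            vi = 0.0
        else:
            spikes.append(0)
        new_v.append(vi)
    return new_v, spikes

# Two neurons; a spike from neuron 0 excites neuron 1 at the next step.
lateral_w = [[0.0, 0.0], [0.6, 0.0]]
v, s = lif_population_step([0.0, 0.0], [1.2, 0.3], [0, 0], lateral_w)
v, s2 = lif_population_step(v, [0.0, 0.3], s, lateral_w)
```

In this toy run, neuron 1's feed-forward input alone never reaches threshold; it fires only because the lateral connection relays neuron 0's earlier spike, illustrating how intra-layer coupling lets a population express patterns no isolated neuron could.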

With our proposed fully spiking actor network with intra-layer connections (ILC-SAN), we eliminate the need for floating-point matrix operations and achieve greater energy efficiency. This advancement brings us closer to realizing realistic control tasks with AI, all while significantly reducing energy consumption.

Conclusion

The combination of spiking neural networks and deep reinforcement learning holds enormous potential for revolutionizing AI. By exploring innovative solutions and leveraging the efficiency of neuromorphic hardware, we can overcome the limitations of traditional spike-based RL methods.

With our proposed fully spiking actor network with intra-layer connections, we present a promising approach to achieving multi-dimensional deterministic policies without relying on floating-point matrix operations.

As we continue to push the boundaries of AI research, it is crucial to explore new ideas and embrace the potential of emerging technologies. By combining neuroscience-inspired approaches with cutting-edge hardware, we can create AI systems that are not only powerful but also energy-efficient.

In terms of future directions, further research could explore the performance and scalability of the ILC-SAN architecture in different control tasks and compare it against other existing methods. Additionally, investigating the potential for optimizing the network’s efficiency and training algorithms specific to SNNs could contribute to enhancing its performance and energy efficiency even further.