Multi-objective exponential distribution optimizer (MOEDO): a novel math-inspired multi-objective algorithm for global optimization and real-world engineering design problems
This section delves into the fundamental components of the EDO algorithm64. We first discuss the inspiration behind EDO, followed by a detailed explanation of its initialization. We also examine the explorative and exploitative features of EDO before discussing its mathematical structure.
Exponential distribution optimizer (EDO)
Exponential distribution model
The EDO approach takes its foundational principles from the theory of the exponential distribution. This distribution, which is continuous in nature, has been instrumental in explaining several occurrences in the natural world. For instance, the time span between the present moment and the occurrence of an earthquake can be represented using this distribution. Similarly, the time-based likelihood of a car reaching a toll station aligns with the exponential distribution. Exponential random variables have frequently been utilized to shed light on past events, particularly focusing on the duration leading up to a specific incident. Here, we delve into the mathematical construct of the exponential distribution and elucidate its unique attributes. Imagine having an exponential random variable, represented by \(x\), associated with a parameter, \(\lambda\). This relationship can be expressed as \(x\sim \text{EXP}(\lambda)\). The Probability Density Function (PDF) of this variable characterizes the duration until the event:
$$f(x)=\begin{cases}\lambda e^{-\lambda x}, & x\ge 0\\ 0, & \text{otherwise}\end{cases}$$
(1)
As time is a continuous factor, it must always hold a non-negative value, i.e., \(x\ge 0\). Significantly, the \(\lambda\) parameter, which is always positive, denotes the rate of occurrence in the exponential distribution. The cumulative distribution function (CDF) of the exponential distribution is given by:
$$F(x)=\begin{cases}1-e^{-\lambda x}, & x\ge 0\\ 0, & \text{otherwise}\end{cases}$$
(2)
A higher rate of occurrence, \(\lambda\), implies that large values of the random variable become less probable.
The CDF ascends monotonically, starting from zero with an initial slope equal to the exponential rate and rising as the exponential random variable grows. For an exponentially distributed random variable, its mean \((\mu)\) and variance \((\sigma^2)\) can be articulated as:
$$\mu =\frac{1}{\lambda },\qquad \sigma ^2=\frac{1}{\lambda ^2}.$$
(3)
These expressions reveal an inverse relationship between the \(\lambda\) parameter and both the mean and variance: as \(\lambda\) increases, both the mean and variance decline. Moreover, the standard deviation \((\sigma)\) mirrors the mean value, as follows from:
$$\sigma =\sqrt{\frac{1}{\lambda ^2}}=\frac{1}{\lambda }=\mu$$
(4)
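As a quick numerical illustration of Eqs. (3) and (4) (a minimal NumPy sketch of ours, not part of the algorithm itself), the sample mean and sample standard deviation of exponentially distributed data both approach \(1/\lambda\):

```python
# A quick numerical check of Eqs. (3) and (4): for x ~ EXP(lambda), the
# sample mean and sample standard deviation both approach 1/lambda.
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0                                        # rate parameter lambda
samples = rng.exponential(scale=1.0 / lam, size=1_000_000)
print(samples.mean(), samples.std())             # both close to 0.5 = 1/lam
```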
The memoryless nature of the exponential distribution
One of the unique characteristics of certain statistical probability distributions is the 'memoryless' attribute. This implies that the chance of a forthcoming event transpiring is independent of past events. In simpler terms, previous occurrences do not influence future probabilities. The exponential distribution, which is a continuous type, encapsulates this memoryless attribute, especially when gauging the timespan before an event occurrence. When a random variable, denoted by \(x\), adheres to the exponential distribution with this memoryless feature, then for any non-negative values \(s\) and \(t\) the following holds true:
$$P(x>s+t\mid x\ge s)=P(x>t).$$
(5)
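The identity in Eq. (5) can be verified numerically from the CDF of Eq. (2), since \(P(x>a)=e^{-\lambda a}\). The following small sketch (our illustration; the values of \(\lambda\), \(s\) and \(t\) are arbitrary) confirms that the conditional probability equals the unconditional one:

```python
# A numerical illustration of the memoryless property, Eq. (5), using
# P(x > a) = e^{-lambda a} from the CDF of Eq. (2).
import math

lam, s, t = 1.5, 2.0, 3.0
survival = lambda a: math.exp(-lam * a)          # P(x > a)
lhs = survival(s + t) / survival(s)              # P(x > s+t | x >= s)
rhs = survival(t)                                # P(x > t)
print(lhs, rhs)                                  # identical values
```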
Launching with the initial group
During the onset or initiation stage, we cultivate a group termed \(X_{\text{winners}}\), comprising \(N\) diversely valued, randomly formed solutions. To depict this search operation, we utilize an assortment of exponential distributions. Every potential solution is perceived as an embodiment of the exponential distribution, and the respective positions of each solution are considered as exponential random variables conforming to this distribution. These are then structured as vectors of dimension \(d\):
$$X_{\text{winners}_i}=\left[X_{\text{winners}_{i,1}},X_{\text{winners}_{i,2}},\dots ,X_{\text{winners}_{i,d}}\right]$$
(6)
$$X_{\text{winners}_{i,j}}\in [lb,ub],\quad i=1,2,\dots ,N,\;\; j=1,2,\dots ,d.$$
(7)
In this setup, \(X_{\text{winners}_{i,j}}\) symbolizes the \(j\)th element of the \(i\)th candidate within the exponential distribution vector. Following this, we can define the preliminary group, \(X_{\text{winners}}\):
$$X_{\text{winners}}=\begin{bmatrix}X_{\text{winners}_{1,1}} & X_{\text{winners}_{1,2}} & \dots & X_{\text{winners}_{1,d}}\\ X_{\text{winners}_{2,1}} & X_{\text{winners}_{2,2}} & \dots & X_{\text{winners}_{2,d}}\\ \vdots & \vdots & \ddots & \vdots \\ X_{\text{winners}_{N,1}} & X_{\text{winners}_{N,2}} & \dots & X_{\text{winners}_{N,d}}\end{bmatrix}$$
(8)
To randomly generate each variable within the candidate exponential distribution in the solution space, we utilize the following formula:
$$X_{\text{winners}_{i,j}}=lb+\text{rand}\times (ub-lb).$$
(9)
Here, \(lb\) and \(ub\) delineate the lower and upper limits of the given problem. The term ‘rand’ signifies a randomly derived number within the span [0,1]. Upon concluding the initiation, we embark on the optimization phase. This leans on the exploratory and refinement skills of our proposed approach over iterative rounds. Subsequent sections will elucidate the two core techniques (exploration and refinement within EDO) for pinpointing the global pinnacle.
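A minimal sketch of this initialization step, following Eqs. (6)-(9), might look as follows in Python (the function name and the concrete values of \(N\), \(d\), \(lb\) and \(ub\) are ours, chosen for illustration):

```python
# A minimal sketch of the initialization of Eqs. (6)-(9): N candidates
# of dimension d drawn uniformly in [lb, ub].
import numpy as np

def initialize_population(N, d, lb, ub, rng):
    # X_winners[i, j] = lb + rand * (ub - lb), Eq. (9)
    return lb + rng.random((N, d)) * (ub - lb)

rng = np.random.default_rng(42)
X_winners = initialize_population(N=30, d=5, lb=-10.0, ub=10.0, rng=rng)
print(X_winners.shape)                           # (30, 5)
```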
EDO exploitation strength
The exploitation component of EDO harnesses various facets of the exponential distribution model, including its memoryless trait, exponential rate, variance and mean value. Additionally, a guiding solution steers the search towards the global peak. Initially in EDO, a set of random solutions is cultivated, mimicking an array of exponential distribution patterns. These solutions undergo evaluation via an objective function and are subsequently ranked in terms of performance. For maximization problems, they are ordered in descending efficiency, while for minimization challenges, they are arranged in ascending order. Areas surrounding a robust solution are fertile grounds for pinpointing the global apex. This is why several algorithms probe spaces around robust solutions, drawing weaker ones towards them. Therefore, the quest for the global apex focuses on the guiding solution. The guiding solution, labeled \(Xguide^{time}\), is derived from the mean of the top three solutions in the sorted group:
$$Xguide^{time}=\frac{X_{\text{winners}_{best1}}^{time}+X_{\text{winners}_{best2}}^{time}+X_{\text{winners}_{best3}}^{time}}{3}$$
(10)
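In code, the guiding solution of Eq. (10) reduces to averaging the three best-ranked rows of the population (a sketch of ours, assuming a minimization problem so that lower fitness is better):

```python
# A sketch of Eq. (10): the guiding solution is the mean of the three
# best-ranked solutions; the function name is ours.
import numpy as np

def guiding_solution(X_winners, fitness):
    best3 = np.argsort(fitness)[:3]          # indices of the three best rows
    return X_winners[best3].mean(axis=0)     # element-wise average, Eq. (10)

rng = np.random.default_rng(0)
X = rng.random((30, 5))
fit = rng.random(30)
print(guiding_solution(X, fit))              # a d-dimensional guide vector
```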
In this context, time replaces the term iteration, alluding to the period until the subsequent event in the exponential distribution. The maximum number of iterations is symbolized by Max_time. This guiding solution offers invaluable insights about the global apex. Instead of exclusively leaning on the current best solution, the guiding one is prioritized. Though the leading solution has vital details about the global peak, an exclusive focus on it could lead to convergence around a local optimum. By introducing a current guiding solution, this challenge is mitigated. To emulate the memoryless characteristic of the exponential distribution, a matrix, termed memoryless, is formulated. This matrix houses the latest solutions, irrespective of their present efficiency scores. Initially, it mirrors the original population. Following this, solutions generated at the present time, regardless of their efficiency, are stored in the matrix, dismissing their past contributions. Operating in line with the memoryless property, if current solutions do not fare well against their counterparts in the original group, their success probabilities in subsequent iterations match those of the present. Therefore, past failures do not dictate future outcomes. Within the memoryless matrix, there are two categories of solutions: winners and losers. While losers can still contribute to the optimization process alongside winners, a solution is deemed victorious if its efficiency surpasses that of its counterpart in the \(X_{\text{winners}}\) group. If a solution is updated in both the \(X_{\text{winners}}\) and memoryless matrices, it is a winner; otherwise, it is a loser.
Our designed exploitation model, focused on updating solutions adhering to the exponential distribution, relies on both winning and losing solutions. The updated solution, \(V_i^{time+1}\), is modelled by a formula involving random and adaptive parameters. This formula strives to locate the global peak near an efficient solution, creating a new solution for future populations. It examines how winners and losers navigate within the search space, leveraging valuable data from winners:
$$V_i^{time+1}=\begin{cases}a\cdot \left(\text{memoryless}_i^{time}-\sigma ^2\right)+b\cdot Xguide^{time}, & \text{if } X_{\text{winners}_i}^{time}=\text{memoryless}_i^{time}\\ b\cdot \left(\text{memoryless}_i^{time}-\sigma ^2\right)+\log (\phi )\cdot X_{\text{winners}_i}^{time}, & \text{otherwise}\end{cases}$$
(11)
$$a=f^{10},\qquad b=f^{5},\qquad f=2\times \text{rand}-1$$
(12)
Additionally, the exponential rate, in relation to the mean, can be deduced:
$$\lambda =\frac{1}{\mu },$$
(13)
$$\mu =\left(\text{memoryless}_i^{time}+Xguide^{time}\right)/2$$
(14)
The exponential mean arises from averaging the guiding solution and the associated memoryless vector, which can be either a winner or loser.
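Putting Eqs. (11)-(14) together, the exploitation update for one solution might be sketched as below. Note that \(\phi\) is not defined in this excerpt; we assume, as in the original EDO formulation, a uniform random number in \((0,1]\):

```python
# A sketch of the exploitation update of Eqs. (11)-(14). Per Eq. (12),
# a = f^10 and b = f^5 with f = 2*rand - 1. The mean mu is computed
# element-wise from the memoryless vector and the guide (Eq. 14),
# lambda = 1/mu (Eq. 13), hence sigma^2 = 1/lambda^2 = mu^2 by Eq. (3).
import numpy as np

def exploitation_update(memoryless_i, X_winners_i, X_guide, rng):
    f = 2.0 * rng.random() - 1.0
    a, b = f**10, f**5                            # Eq. (12)
    mu = (memoryless_i + X_guide) / 2.0           # Eq. (14)
    variance = mu**2                              # sigma^2 = mu^2
    if np.array_equal(X_winners_i, memoryless_i): # solution i is a winner
        return a * (memoryless_i - variance) + b * X_guide
    phi = 1.0 - rng.random()                      # assumed uniform in (0, 1]
    return b * (memoryless_i - variance) + np.log(phi) * X_winners_i
```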
EDO exploration potential
This section delves into the exploration aspect of the introduced algorithm. The exploration segment pinpoints potential areas within the search domain that likely house the ideal, global solution. The EDO exploratory blueprint derives from two prominent solutions, or winners, from the primary population, which adhere to the exponential distribution pattern:
$$V_i^{time+1}=X_{\text{winners}_i}^{time}-M^{time}+\left(cZ_1+(1-c)Z_2\right)$$
(15)
$$M^{time}=\frac{1}{N}\sum_{j=1}^{N}X_{\text{winners}_{j,i}}^{time},\quad i=1,2,\dots ,d$$
(16)
In updating the new solution, \(M^{time}\) represents the average of all solutions sourced from the primary group. This average is computed by adding all exponential random variables from the same dimension and then dividing by the total population size, denoted as \(N\). The term \(c\) signifies a refined parameter, indicating the proportion of information drawn from the \(Z_1\) and \(Z_2\) vectors towards the contemporary solution, and is formulated as:
$$c=d\times f,\qquad d=1-\frac{time}{Max\_time}$$
(17)
Here, \(d\) stands as a flexible parameter that gradually decreases towards zero as time progresses. 'time' alludes to the present moment, while 'Max_time' signifies the comprehensive duration or maximum number of iterations. Both \(Z_1\) and \(Z_2\) are viewed as potential vectors, formed by:
$$Z_1=M-D_1+D_2,\qquad Z_2=M-D_2+D_1,\qquad D_1=M-X_{\text{winners}_{rand1}},\qquad D_2=M-X_{\text{winners}_{rand2}}$$
(18)
Furthermore, \(D_1\) and \(D_2\) delineate the distances between the average solution and the 'winners' randomly picked from the initial population. At the inception of the optimization journey, a notable disparity exists between the average solution and the standout performers. Yet, as the process nears its end, this gap narrows. To establish the \(Z_1\) and \(Z_2\) vectors, exploration is conducted around the average solution, aided by a pair of randomly chosen outstanding solutions.
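A corresponding sketch of the exploration update of Eqs. (15)-(18) follows (names track the paper where possible; `d_adaptive` stands in for the overloaded symbol \(d\), which elsewhere denotes the dimension):

```python
# A sketch of the exploration update of Eqs. (15)-(18). M is the mean of
# the population (Eq. 16); c = d_adaptive * f (Eq. 17); Z1 and Z2 are
# built around M from two randomly picked winners (Eq. 18).
import numpy as np

def exploration_update(X_winners, i, time, max_time, rng):
    N = X_winners.shape[0]
    M = X_winners.mean(axis=0)                    # Eq. (16), per dimension
    f = 2.0 * rng.random() - 1.0
    d_adaptive = 1.0 - time / max_time            # shrinks as time advances
    c = d_adaptive * f                            # Eq. (17)
    r1, r2 = rng.choice(N, size=2, replace=False) # two random winners
    D1, D2 = M - X_winners[r1], M - X_winners[r2] # distances to the mean
    Z1, Z2 = M - D1 + D2, M - D2 + D1             # Eq. (18)
    return X_winners[i] - M + c * Z1 + (1.0 - c) * Z2  # Eq. (15)
```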
Optimizing with exponential distribution optimizer (EDO)
The EDO method we are introducing follows a series of steps to thoroughly navigate the search space, aiming for the global optimum. Initially, we create a collection of solutions, randomly generated and marked by a wide range of values. The search process is depicted using various exponential distributions and, as such, the location of each solution can be seen as random variables adhering to this distribution. We design a matrix without memory to mimic the absence of memory; initially, it mirrors the original group of solutions. Leveraging the exploratory and refining stages of our method, every solution moves closer to the global optimum over time. During the refining stage, the matrix without memory is used to store the outcomes from the prior step, irrespective of their past performance, allowing them to play a pivotal role in shaping the new solutions. This leads to the categorization of solutions into two groups: the successful ones and the unsuccessful ones. Additionally, we incorporate various properties of the exponential distribution, like its mean, rate and variance. The successful solution is guided by a leading solution, while the unsuccessful one follows the successful one, aiming to discover the global optimum nearby. In the exploration stage, the fresh solution is influenced by two randomly chosen successful solutions from the initial group and the average solution. At the start, both the average solution and its variance are distant from the global optimum. However, the gap between the average solution and the global optimum narrows until it reaches its lowest point during the optimization. A toggle parameter \(a\) determines whether to embark on the exploration or refining stage: the refining (exploitation) stage is chosen when \(a<0.5\), and the exploration stage otherwise.
$$V_i^{time+1}=\begin{cases}a\cdot \left(\text{memoryless}_i^{time}-\sigma ^2\right)+b\cdot Xguide^{time}, & \text{if } (a<0.5) \text{ and } X_{\text{winners}_i}^{time}=\text{memoryless}_i^{time}\\ b\cdot \left(\text{memoryless}_i^{time}-\sigma ^2\right)+\log (\phi )\cdot X_{\text{winners}_i}^{time}, & \text{otherwise, if } (a<0.5)\\ X_{\text{winners}_i}^{time}-M^{time}+\left(cZ_1+(1-c)Z_2\right), & \text{otherwise}\end{cases}$$
(19)
After crafting the new solutions, each solution's boundaries are verified and the solutions are then stored in the matrix without memory. A selection strategy is employed to incorporate the new solutions from both stages into the initial group: if a new solution proves beneficial, it is integrated into the primary group. By the conclusion of the optimization, all solutions cluster around the global optimum. In the best solution, both the mean and variance are anticipated to be minimal, while the rate parameter \(\lambda\) is expected to be large. The pseudo-code of the single-objective EDO is shown in Algorithm 1.
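The pieces above can be composed into one EDO generation as sketched below (reusing the helper functions from the earlier sketches; the switch parameter is renamed `alpha` to avoid clashing with the \(a\) of Eq. (12), and the greedy replacement rule and minimization setting are our assumptions):

```python
# A compact sketch of one EDO generation, composing the helper functions
# from the sketches above with the switch of Eq. (19).
import numpy as np

def edo_generation(X_winners, memoryless, fitness, objective,
                   time, max_time, lb, ub, rng):
    X_guide = guiding_solution(X_winners, fitness)        # Eq. (10)
    for i in range(X_winners.shape[0]):
        alpha = rng.random()
        if alpha < 0.5:                                   # exploitation
            V = exploitation_update(memoryless[i], X_winners[i],
                                    X_guide, rng)
        else:                                             # exploration
            V = exploration_update(X_winners, i, time, max_time, rng)
        V = np.clip(V, lb, ub)              # boundary verification
        memoryless[i] = V                   # stored regardless of quality
        f_new = objective(V)
        if f_new < fitness[i]:              # winner: update X_winners too
            X_winners[i], fitness[i] = V, f_new
    return X_winners, memoryless, fitness
```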
Proposed multi-objective exponential distribution optimizer (MOEDO)
Preliminaries of multi-objective optimization
In multi-objective optimization problems (MOPs), there is a simultaneous effort to minimize or maximize at least two conflicting objective functions. While a single-objective optimization effort zeroes in on one optimal solution with the prime objective function value, MOO presents a spectrum of optimal outcomes known as Pareto optimal solutions. An elaboration on the idea of domination and associated terminology is illustrated in Fig. 2.
Multi-objective exponential distribution optimizer (MOEDO)
The MOEDO algorithm starts with a random population of size \(N\). Let \(t\) denote the current generation, and let \(x_i^{t}\) and \(x_i^{t+1}\) be the \(i\)th individual at generations \(t\) and \(t+1\), respectively. \(u_i^{t+1}\) is the \(i\)th individual at generation \(t+1\) generated through the EDO algorithm from the parent population \(P_t\); its fitness value is \(f_i^{t+1}\), and \(U^{t+1}\) is the set of all \(u_i^{t+1}\). We can then calculate \(x_i^{t+1}\) from \(u_i^{t+1}\), generated through the EDO algorithm and the Information Feedback Mechanism (IFM)72 of Eq. (20).
$$x_i^{t+1}=\partial _1u_i^{t+1}+\partial _2x_k^{t};\qquad \partial _1=\frac{f_k^{t}}{f_i^{t+1}+f_k^{t}},\quad \partial _2=\frac{f_i^{t+1}}{f_i^{t+1}+f_k^{t}},\quad \partial _1+\partial _2=1$$
(20)
where \(x_k^{t}\) is the \(k\)th individual chosen from the \(t\)th generation, \(f_k^{t}\) is its fitness value, and \(\partial _1\) and \(\partial _2\) are weight coefficients. The offspring population \(Q_t\) is the set of \(x_i^{t+1}\). The combined population \(R_t=P_t\cup Q_t\) is sorted into different non-dominated levels \((F_1,F_2,\dots ,F_l,\dots ,F_w)\). Beginning from \(F_1\), all individuals in levels 1 to \(l\) are added to \(S_t=\bigcup_{i=1}^{l}F_i\) and the remaining members of \(R_t\) are rejected, as illustrated in Fig. 3. If \(|S_t|=N\), no further action is required and the next generation begins with \(P_{t+1}=S_t\) directly. Otherwise, solutions in \(S_t\setminus F_l\) are included in \(P_{t+1}\) and the remaining \(N-\sum_{i=1}^{l-1}|F_i|\) solutions are selected from \(F_l\) according to the crowding distance (CD) mechanism: the larger the crowding distance, the higher the probability of selection. The termination condition is then checked. If it is not satisfied, we set \(t=t+1\) and repeat; if it is satisfied, \(P_{t+1}\) is generated as represented in Algorithm 2 and then used to produce a new population \(Q_{t+1}\) via the EDO algorithm. This careful selection strategy has a computational complexity of \(O(MN^2)\) for \(M\) objectives. MOEDO incorporates the proposed information feedback mechanism to effectively guide the search process, ensuring a balance between exploration and exploitation. This leads to improved convergence, coverage and diversity preservation, which are crucial aspects of multi-objective optimization. The MOEDO algorithm does not require any new parameters beyond the usual EDO parameters, such as the population size and the termination criterion. The flow chart of the MOEDO algorithm is shown in Fig. 4.
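A minimal sketch of the IFM blend of Eq. (20) follows (our illustration; how \(x_k^{t}\) and the scalar fitness values are obtained is left to the surrounding MOEDO machinery):

```python
# A sketch of the IFM blend of Eq. (20). u_next is the EDO offspring
# u_i^{t+1} with fitness f_u; x_k is the k-th individual of generation t
# with fitness f_k; the two weights sum to one.
import numpy as np

def ifm_combine(u_next, f_u, x_k, f_k):
    w1 = f_k / (f_u + f_k)                   # partial_1, Eq. (20)
    w2 = f_u / (f_u + f_k)                   # partial_2 = 1 - partial_1
    return w1 * u_next + w2 * x_k

u = np.array([1.0, 2.0]); xk = np.array([0.5, 1.5])
print(ifm_combine(u, 4.0, xk, 2.0))          # weighted blend toward xk
```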