_{1}

Maximum Entropy Empirical Likelihood (MEEL) methods are extended to bivariate distributions with closed form expressions for their bivariate Laplace transforms (BLT) or moment generating functions (BMGF) without closed form expressions for their bivariate density functions which make the implementation of the likelihood methods difficult. These distributions are often encountered in joint modeling in actuarial science and finance. Moment conditions to implement MEEL methods are given and a bivariate Laplace transform power mixture (BLTPM) is also introduced, the new operator generalizes the existing univariate one in the literature. Many new bivariate distributions including infinitely divisible(ID) distributions with closed form expressions for their BLT can be created using this operator and MEEL methods can also be applied to these bivariate distributions.

Bivariate distributions are useful for joint modelling and naturally fitting these distributions is a necessity for pricing in insurance and finance. For example, in finance fitting a bivariate jump diffusion model to joint returns data allowed us to price a basket option accordingly, see Ruijter and Oosterlee [

In order to find suitable bivariate parametric models, compound methods and copulas are often used. New bivariate distributions created using the traditional distribution or survival copulas are often continuous with closed form distribution functions or density functions so that methods based on likelihood functions can be applied for statistical inferences. It is also well known that many useful bivariate infinitely divisible distributions do not have closed form density function nor distribution function yet very useful for applications.

For the univariate case, the LT power mixture (LTPM) operator as introduced by Abate and Whitt [

It appears natural to extent inferences based on bivariate maximum entropy empirical likelihood (MEEL) to distributions with closed form bivariate LT (BLT) or closed form bivariate moment generating function(BMGF) which are similar to the univariate case as given by Luong [

It is also worth to mention that many bivariate distributions with a singular component in a domain such as the Bivariate exponential distribution as introduced by Marshall and Olkin [

In Section 2 some examples from actuarial science and finance are given to illustrate the usefulness of bivariate models without closed form density functions but with closed form BLTs. The BLTPM operator is introduced in Section 3 and it is shown that it can be used to create bivariate ID distribution with closed form BLT.MEEL methods are introduced in Section 3 with the proposal of two bases to generate moment conditions. The elements of the bases are based or BLT or BMGF and do not need the density functions explicitly. These bases are proposed to balance efficiencies and numerical tractability as a base with a large number of elements tend to create numerical difficulties on implementation of the MEEL methods. The asymptotic properties which already appear in the literature are restated emphasizing bases and projections of the score functions. We also discuss numerical implementation of the MEEL methods using penalty function approach as given by Luong [

In the next section, we shall give some examples to illustrate that in many situations we end up working with bivariate distributions with closed form BLT or closed from BMGF but without a closed form for distributions functions or density functions.

Example 1

The following bivariate distribution introduced by Partrat [

L X ( s , t ) = g ( L U ( s ) , L V ( t ) ) , see Partrat [

The bivariate probability generating function (BPGF) for N = ( N 1 , N 2 ) ′ is denoted by g ( s , t ) and if N = ( N 1 , N 2 ) ′ follows a bivariate Poisson distribution then its PGF is given as g ( s , t ) = e ф 1 ( s − 1 ) + ф 2 ( t − 1 ) + ф 12 ( s t − 1 ) , see Partrat [

The parameters ф 1 , ф 2 , ф 12 are all nonnegative. This is the only bivariate Poisson distribution which is infintely divisble(ID), see Dwass and Teicher [

[ g ( L U ( s ) , L V ( t ) ) ] 1 n

is a BLT for each positive integer n, using the remarks given by Abate and Whitt [

Clearly, we can differentiate L X ( s , t ) or using a conditioning argument with the random sums representations to obtain the first two moments of the vector X and these first two moments will be icluded in the set of moment conditions for MEEL methods developed subsequently in section 4. The following example gives a model which is used in finance.

Example 2

Ruijter and Oosterlee [

X = B + ∑ i = 1 N J i ,

with B following a bivariate normal distribution, B ~ N ( μ , Σ ) .

The mean vector μ = ( μ 1 , μ 2 ) ′ and the covariance matrix Σ = ( σ 11 σ 12 σ 12 σ 22 ) , the random vector B is independent of the random sum ∑ i = 1 N J i , the J i ’s are independent and identically dsitributed as J which follows a bivariate normal distribution, J ~ N ( μ J , Σ J ) with

μ J = ( μ 1 J , μ 2 J ) ′ , Σ J = ( σ 11 J σ 12 J σ 12 J σ 22 J )

This is a jump-diffusion model with the diffusion part given by B and the jump part by the random sum ∑ i = 1 N J i . Also, the J_{i}’s, B, Z and N are independent with N following a Poisson distribution with parameter λ. Comparing to a bivariate normal model, the bivariate jump diffusion model has an extra jump component. Ruijter and Oosterlee [

M X ( s , t ) = e s ′ μ + 1 2 s ′ Σ s ⋅ exp ( λ ( e s ′ μ J + 1 2 s ′ Σ J s − 1 ) ) , − ∞ < s , t < ∞ .

The domain of M X ( s , t ) is the entire plane due to the use of the Poisson and normal distributions. Clearly, this an ID BMGF and the corresponding jump diffusion process is a bivariate Lévy process. Also, the first two moments of the vector X can be extracted from the BMGF. For this model, MEEL methods can be used for estimation and model testing. The class of bivariate normal mean variance mixture as described by McNeil et al [

Furthermore, MEEL methods can also be used for testing the null composite hypothesis which specifies that the parametric model fits the data. The test statistics based on MEEL methods can be constructed in a unified way and obtained simultaneously with the estimation procedures. This feature is useful for doing applied works and the test statistics is less complicated than statistics based on the Mahanolobis distance, see Mc Assey [

These properties of bivariate MEEL methods are similar to properties of univariate MEEL methods as discussed by Luong [

The Laplace transform power mixture operator (LTPM) has been introduced in the literature for creating univariate non-negative distribution and can be viewed as a continuous type of compounding operator and it makes use of LT of a distribution and the LT of a mixing random variable to create a new distribution specified by its LT. It is due to Abate and Whitt [

For illustrations, some examples of new bivariate distribution specified by BLT and created using the BLTPM operator will be given for illustrations in section 3.2. The BLTPM operator can also be used to create new distribution with one marginal distribution which is discrete and the other marginal being continuous. These types of bivariate distributions appear to be useful for actuarial science and possibly for other fields as well.

Often, we need to know whether a function can be considered as LT of some univariate random variable, the notion of completely monotonicity of a function is useful for characterizing a function to be a proper LT of a random variable and can be found in Feller [

Definition

A function ф ( s ) with the domain given by s ≥ 0 is completely monotone (CM) if it possesses derivatives ф ( n ) of all order and ( − 1 ) n ф ( n ) ( s ) ≥ 0 .

Theorem

A function ф ( s ) with the domain given by s ≥ 0 is the LT of a random variable if and only if it is CM.

Now if we assume that the LT of a random variable is f ^ ( s ) so that f ^ α ( s ) is a proper LT, also suppose that the LT of another random variable Y is g ^ ( t ) so that g ^ α ( t ) is a proper LT and α is a non-negative random variable with distribution function H and LT h ^ ( u ) . Furthermore, we assume that conditioning on a realized value α, the LT f ^ α ( s ) which is the LT of a random variable denoted by X ( α ) and the LT g ^ α ( t ) which is the LT of a random variable denoted by Y ( α ) are independent but when integrate out with respect to the distribution of α will give a new bivariate LT for a vector X = ( Y , Z ) ′ ,

L X ( s , t ) = ∫ 0 ∞ f ^ α ( s ) ⋅ g ^ α ( t ) d H ( α ) . (1)

Using the LT of α, it can be re-expressed as

L X ( s , t ) = h ^ ( − log ( f ^ ( s ) ) − log ( g ^ ( t ) ) ) . (2)

Expression (2) gives a definition of the BLTPM operator and it generalizes expression (6.1) given by Abate and Whitt [

M X ( s , t ) = h M ^ ( log ( f M ^ ( s ) ) + log ( g M ^ ( t ) ) ) . (3)

Observe that if X ( α ) , Y ( α ) are ID and α ≥ 0 is continuous and ID then the new bivariate distribution created is also ID. We can interpret this property using Lévy processes, as two independent Lévy processes subject to the same time change Lévy process is a bivariate Lévy process. This generalizes the argument used for the univariate case given by Abate and Whitt [

A form of bivariate PM operator, the BSDPM has been used in the literature but with the use of distribution functions or survival functions, see the seminal works of Marshall and Olkin [

In section 3.2, we shall see that additional requirements are needed when working with LT’s instead of distribution functions or survival functions for the use of these copulas traditionally used to create new bivariate distribution function using prescribed marginal distributions.

This is due to the conditions e − α ф − 1 ( L 1 ( s ) ) and e − α ф − 1 ( L 2 ( t ) ) , α > 0 are proper LT are more difficult to verify than the corresponding conditions, e − α ф − 1 ( F 1 ( x ) ) and e − α ф − 1 ( F 2 ( y ) ) are proper distribution functions with F 1 ( x ) and F 2 ( y ) being distribution functions when the BDSPM operator is used, see Marshall and Olkin [

We shall give a few examples below to illustrate the use of the BLTPM operator to create bivariate ID distribution using the LT of a gamma mixing random variable. The same procedure can be used with other mixing random variables such as the inverse Gaussian random variable or the exponential mixture inverse Gaussian (EMIG) random variable. The EMIG distribution as studied by Abate and Whitt [

Example 3

For modeling two type of claims in insurance for one period of time, we might want to construct a joint continuous model using the BLTPM operator with the mixing random variable following a gamma distribution with LT h ^ ( u ) = ( 1 + u ) − τ , τ > 0 , f ^ ( s ) and g ^ ( t ) are LT of inverse Gaussian distributions with LTs given respectively by

f ^ ( s ) = exp ( θ 1 μ 1 ( 1 − 1 + 2 μ 1 2 s θ 1 ) ) and g ^ ( t ) = exp ( θ 2 μ 2 ( 1 − 1 + 2 μ 2 2 t θ 2 ) ) ,

θ i , μ i > 0 , i = 1 , 2 see Klugman et al. [

Apply the BLTBPM operator using these LTs, this will give a new bivariate continuous distribution for a random vector X = ( Y , Z ) ′ which is ID and nonnegative with its BLT given by

L X ( s , t ) = ( 1 − θ 1 μ 1 ( 1 − 1 + 2 μ 1 2 s θ 1 ) − θ 2 μ 2 ( 1 − 1 + 2 μ 2 2 t θ 2 ) ) − τ .

From L X ( s , t ) , the Pearson product moment correlation coefficient can be obtained and the coefficient can be used to study the dependence between Y and Z. Other depence measures can also be used to study associativity and dependence, see Chapter 4 of the book by Balakrishnan and Lai [

Example 4

Bivariate mixed models with one marginal for counts and the other one for claim amounts appear to be useful in actuarial science. These models can also be constructed using the BLTPM operator. For example, let the random vector X = ( Y , Z ) ′ with Y being discrete and Z being continuous and nonnegative. We might specify h ^ ( u ) = ( 1 + u ) − τ , τ > 0 , f ^ ( s ) = exp ( λ ( e − s − 1 ) ) which is the LT of a Poisson random variable with λ > 0 , let and

g ^ ( t ) = exp ( θ μ ( 1 − 1 + 2 μ 2 t θ ) ) , θ and μ > 0 which is the LT of an inverse Gaussian distribution.

The BLT of the random vector X = ( Y , Z ) ′ created using the BLTPM operator is

L X ( s , t ) = ( 1 − λ ( e − s − 1 ) − θ μ ( 1 − 1 + 2 μ 2 t θ ) ) − τ .

Setting t = 0 we obtain the marginal LT of Y which is given by

L Y ( s ) = ( 1 − λ ( e − s − 1 ) ) − τ . One can recognize that this is a LT of a negative binomial distribution and for this distribtion, the corresponding probability

generating function (PGF) P Y ( s ) = ( 1 − λ ( s − 1 ) ) − τ , see Klugman et al. [

The marginal L Z ( t ) = ( 1 − θ μ ( 1 − 1 + 2 μ 2 t θ ) ) − τ is the LT of a continuous random variable.

Example 5

In this example we shall obtain a bivariate negative binomial distribution using

the BLTPM operator. Let h ^ ( u ) = ( 1 + u ) − τ , τ > 0 , f ^ ( s ) = exp ( λ 1 ( e − s − 1 ) ) and

g ^ ( t ) = exp ( λ 2 ( e − s − 1 ) ) , λ 1 and λ 2 > 0 . Apply the BLTPM operator this gives a

joint distribution for X = ( Y , Z ) ′ with BLT given by

L X ( s , t ) = ( 1 − λ 1 ( e − s − 1 ) − λ 2 ( e − s − 1 ) ) − τ .

This is a BLT of a bivariate negative binomial distribution which is also ID. In the literature, this distribution has been constructed using various methods and by many authors, see Mardia [

We briefly mention on how to simulate from bivariate distribution with a specified L X ( s , t ) created by the BLTPM operator. The procedures are similar to the procedures described in section 5 and given by Marshall and Olkin [

For the inverse transform method, see Ross [

The main steps are:

1) Generate an observation for the mixing random variable α from the distribution H which has LT h ^ ( u ) .

2) Use the observed value α, generate an observation X ( α ) from a distribution with LT f ^ α ( s ) . Often from the expression of f ^ α ( s ) , we have a procedure to simulate from this distribution and the inverse method might not be needed explicitly at this step.

3) Use the same observed value α, simulate another observation Y ( α ) which is independent of X ( α ) obtained from 2) from a distribution with LT g ^ α ( t ) .

The pair of observations ( X ( α ) , Y ( α ) ) ′ obtained will follow a bivariate distribution with BLT as specified by L X ( s , t ) .

Observe that the BLTPM operator given by expression (2) is similar to the BDSPM operator given by expression 2.2 in Marshall and Olkin [

F ( x , y ) = ф ( ф − 1 ( F 1 ( x ) ) + ф − 1 ( F 2 ( y ) ) ) , (4)

ф is a univariate LT and its inverse given by ф − 1 .

If we let ф = ( 1 + s ) τ which is a LT of a gamma distribution, its inverse is given by ф − 1 = s − 1 τ − 1 , then we will have the classical distribution or survival

distribution Clayton copula as shown in example 4.2 given by Marshall and Olkin [

For new bivariate distribution constructed using the BLTPM, we restrict the attention to nonegative distributions. Note that if the marginal distributions are specified respectively using LTs and given by L 1 ( s ) and L 2 ( t ) and subject to the two functions e − α ф − 1 ( L 1 ( s ) ) and e − α ф − 1 ( L 2 ( t ) ) are proper LT, a BLT of a new distribution with marginal LTs given respectively by L 1 ( s ) and L 2 ( t ) can also be obtained as

L X ( s , t ) = ф ( ф − 1 ( L 1 ( s ) ) + ф − 1 ( L 2 ( t ) ) ) , (5)

which is similar to the procedure as given by expression (4) using distribution or survival functions. This also means that some of the distribution or survival copulas of the class given by expression (4) can also be used as LT copulas. But the drawback of this approach using LT copula here is even by specifying the marginals L 1 ( s ) and L 2 ( t ) being LT of ID distributions, L X ( s , t ) constructed using expression (5) might not necessary be a BLT of a bivariate ID distribution.

It is also more difficult to simulate from a BLT L X ( s , t ) obtained using the LT copula as given by expression (5). Despite the algorithm used to generate a pair of observations from a specified BLT procedure is already given as above but when apply the procedures here we encounter the difficulty on how to simulate from distributions with LTs given by K 1 ( s ) = e − α ф − 1 ( L 1 ( s ) ) and K 2 ( t ) = e − α ф − 1 ( L 2 ( t ) ) as we might not be able to recognize these distributions.The inverse transform method can be applied by using the approximate quantile function based on moment generating functions M 1 ( s ) = e − α ф − 1 ( L 1 ( − s ) ) and M 2 ( s ) = e − α ф − 1 ( L 2 ( − t ) ) to obtain a simulated sample with some accuracy. The approximate quantile functions based on moment generating functions can be obtained explicitly using the saddle point technique and it is given by Arevelillo [

There is a vast literature on survival or distribution copula, we just mention a few here. For a good general review on distribution or survival copula models with emphasis on goodness-of-fit test statistics, see Genest et al. [

As mentioned earlier, asymptotic theory for Maximum entropy empirical likelihood (MEEL) and empirical likelihood (EL) for models where we have X 1 , ⋯ , X n a sample of multivariate observations independent and identically distributed (iid) as X which follows a d-variate multivariate distribution F β , β = ( β 1 , ⋯ , β p ) ′ are well established but assuming a set of moment conditions or constraints has been chosen, see Qin and Lawless [

For applications, the question on how to choose moment conditions or equivalently bases so that MEEL methods have high efficiencies is a relevant one and we would like to address mainly this issue here for models with closed form BLTs or BMGFs as introduced in previous sections. The moment conditions are viewed as constraints and can be identifed with elements of a chosen basis used to project the true score functions as in the univariate case, see Luong [

Let X 1 , ⋯ , X n be sample of bivariate observations and they are independent and identically distributed as X = ( Y , Z ) ′ which follows a bivariate distribution F β , β = ( β 1 , ⋯ , β p ) ′ , F β has no closed from but its BLT or BMGF has a closed form expression and given respectively by L β ( s , t ) or M β ( s , t ) .The true vector of parameters is denoted by β 0 ∈ Ω .The parameter space Ω is assumed to be compact.

Suppose that we can identify k constraints of the form g 1 ( x ; β ) , ⋯ , g k ( x ; β ) where these functions g i ( x ; β ) , i = 1 , ⋯ , k are unbiased estimating functions with the property E β ( g i ( x ; β ) ) = 0 , i = 1 , ⋯ , k , these functions also form a basis B = { g 1 ( x ; β ) , ⋯ , g k ( x ; β ) } used to project the true score functions and MEEL estimators can be viewed as equivalent to quasilikelihood estimators based on the projected score functions, see Luong [

In this section, we shall recommend a base when using a model BLT to extract moment conditions which can be used as a guideline for forming other bases if needed for applying bivariate MEEL methods.

We shall define the first five g_{i}’s as follows which make use of the first two moments of X, i.e.,

g 1 ( x ; β ) = y − E β ( z ) , g 2 ( x ; β ) = z − E β ( z ) g 3 ( x ; β ) = ( y − E β ( y ) ) 2 − V β ( z ) , g 4 ( x ; β ) = ( z − E β ( z ) ) 2 − V β ( z ) , g 5 ( x ; β ) = ( y − E β ( y ) ) ( z − E β ( z ) ) − C o v β ( y , z ) . (6)

The variances of y and z are given respectively by V β ( y ) and V β ( z ) , the covariance between y and z is denoted by C o v β ( y , z ) . Subsequently we will pick the g_{i}’s directly using the model BLT L β ( s , t ) . We shall choose first four points of the BLT L β ( s , t ) , ( s 1 , t 1 ) , ( s 2 , t 2 ) , ( s 3 , t 3 ) , ( s 4 , t 4 ) which belong to a circle of radius r 1 = 0.01 in the nonnegative quadrant and centered at the origin, i.e., let

( s 1 , t 1 ) = r 1 ( cos ( 0 ) , sin ( 0 ) ) , ( s 2 , t 2 ) = r 1 ( cos ( π 6 ) , sin ( π 6 ) )

( s 3 , t 3 ) = r 1 ( cos ( 2π 6 ) , sin ( 2π 6 ) ) , ( s 4 , t 4 ) = r 1 ( cos ( 3π 6 ) , sin ( 3π 6 ) )

which lead to define

g 6 ( x ; β ) = e s 1 y + t 1 z − L β ( s 1 , t 1 ) , g 7 ( x ; β ) = e s 2 y + t 2 z − L β ( s 2 , t 2 ) , g 8 ( x ; β ) = e s 3 y + t 3 z − L β ( s 3 , t 3 ) , g 9 ( x ; β ) = e s 4 y + t 4 z − L β ( s 4 , t 4 ) .

We might want to add another 4 points on the part of the circle of radius r 2 = 0.02 of the nonnegative quadrant and centered at the origin by letting

( s 5 , t 5 ) = r 2 ( cos ( 0 ) , sin ( 0 ) ) , ( s 6 , t 6 ) = r 2 ( cos ( π 6 ) , sin ( π 6 ) )

( s 7 , t 7 ) = r 1 ( cos ( 2π 6 ) , sin ( 2π 6 ) ) , ( s 8 , t 8 ) = r 2 ( cos ( 3π 6 ) , sin ( 3π 6 ) )

which leads to define

g 10 ( x ; β ) = e s 5 y + t 5 z − L β ( s 1 , t 1 ) , g 11 ( x ; β ) = e s 6 y + t 6 z − L β ( s 6 , t 6 ) , g 12 ( x ; β ) = e s 7 y + t 7 z − L β ( s 7 , t 7 ) , g 13 ( x ; β ) = e s 8 y + t 8 z − L β ( s 8 , t 8 ) .

If needed, pick another 4 points on the circle of radius radius r 3 = 0.03 of the nonnegative quadrant centered at the origin by letting

( s 9 , t 9 ) = r 3 ( cos ( 0 ) , sin ( 0 ) ) , ( s 10 , t 10 ) = r 3 ( cos ( π 6 ) , sin ( π 6 ) )

( s 11 , t 11 ) = r 3 ( cos ( 2π 6 ) , sin ( 2π 6 ) ) , ( s 12 , t 12 ) = r 3 ( cos ( 3π 6 ) , sin ( 3π 6 ) )

which leads to define

g 14 ( x ; β ) = e s 9 y + t 9 z − L β ( s 9 , t 9 ) , g 15 ( x ; β ) = e s 10 y + t 10 z − L β ( s 10 , t 10 ) , g 16 ( x ; β ) = e s 11 y + t 11 z − L β ( s 11 , t 11 ) , g 17 ( x ; β ) = e s 12 y + t 12 z − L β ( s 12 , t 12 ) .

With all these g_{i}’s, the basis can be formed and given by

B L T = { g 1 ( x ; β ) , ⋯ , g k ( x ; β ) } , k = 17. (7)

Therefore, if the number of parameters m in the model, m < k , bivariate MEEL methods can be used for estimation and model testing using this basis to generate constraints. Note that the choice of g i ( x ; β ) , i = 1 , ⋯ , 17 do not put restrictions on the parameter space as the model BLT is well defined on the nonnegative quadrant for all values of the vector s = ( s , t ) ′ . This might not be the case with the use of the BMGF M β ( s , t ) where the domain of s might depend on β . We have to be ensured that if restrictions are imposed, the restricted parameter space is all we need for applications. The choice of points are based on the intuitve ground that the BLT contains more information at points near the origin and obviously there is some arbitrairiness on the choice of points or equivalently moment conditions for using MEEL methods.

If we use moment conditions based on the BMGF such as estimation for the jump-diffusion model as given by example 2, in general we might want to keep the first five g i ( x ; β ) ’s as given by expression (6) for BLT and add 12 additional g_{i}’s with chosen points on the circle of radius r = 0.01 centered at the origin without restricting on the first quadrant and define

( s j , t j ) = r ( cos ( j π 6 ) , sin ( j π 6 ) ) , j = 0 , 1 , 2 , ⋯ , 11 which leads to define

g 6 ( x ; β ) = e s 0 z 1 + t 0 z 2 − M β ( s 0 , t 0 ) , g 7 ( x ; β ) = e s 1 z 1 + t 1 z 2 − M β ( s 1 , t 1 ) , ⋮ g 17 ( x ; β ) = e s 11 z 1 + t 11 z 2 − M β ( s 11 , t 11 ) .

Consequently, the base

B M G F = { g 1 ( x ; β ) , ⋯ , g 5 ( x ; β ) , g 6 ( x ; β ) , ⋯ , g 17 ( x ; β ) } (8)

which makes use of the model BMGF M β ( s , t ) can be used for generating constraints.

Note that the BMGF for the jump-diffusion model as given by example 2 is well defined for all points s = ( s , t ) ′ . Note that with a basis of k = 17 elements, it will allow the number of parameters p < 17 .It appears that such a base gives a good balance between efficiency and numerically tractability for the MEEL methods and can be used as a guideline to form other bases if necessary in practice.

Asymptotic properties for Multivariate MEEL methods and for EL methods for models with independent and identically distributed of vector of observations are well established, see Qin and Lawless, Mittelhammer et al. [

∑ i = 1 n π i * ( β ) ln π i * ( β ) (9)

with

π i * = exp ( − ∑ j = 1 k λ j ( β ) ( ∑ i = 1 n g j ( x i ; β ) ) ) ∑ i = 1 n exp ( − ∑ j = 1 k λ j ( β ) ( ∑ i = 1 n g j ( x i ; β ) ) ) , i = 1 , ⋯ , n (10)

subject to

∑ i = 1 n π i * ( λ , β ) ( g j ( x i ; β ) ) = 0 , j = 1 , ⋯ , k . (11)

The λ_{j}'s depend on β and are only implicitly defined using expression (11).

Using standard regularity conditions as indicated in the references just listed, also see section 3.2 given by Luong [

n ( β ^ − β 0 ) → p N ( 0 , Ω ) ,

Ω = [ E [ ∂ g ( x , β ) ∂ β | β = β 0 ] ( E [ g ( x , β ) g ( x , β ) ′ ] | β = β 0 ) − 1 E [ ∂ g ( x , β ) ∂ β ′ | β = β 0 ] ] − 1 , (12)

g ( x , β ) = ( g 1 ( x , β ) , ⋯ , g k ( x , β ) ) ′ , Σ ( β 0 ) = E [ g ( x , β ) g ( x , β ) ′ ] | β = β 0 .

An estimator Ω ^ for Ω is given below,

Ω ^ = [ [ ∑ i = 1 n π ^ i ∂ g ( x i , β ) ∂ β | β = β ^ ] ( ∑ i = 1 n π ^ i g ( x i , β ) g ( x i , β ) ′ | β = β ^ ) − 1 ⋅ [ ∑ i = 1 n π ^ i ∂ g ( x i , β ) ∂ β | β = β ^ ] ] − 1

π ^ i = π i * ( β ^ ) , i = 1 , ⋯ , n . If we let π ^ i = 1 n , i = 1 , ⋯ , n , we have another consistent

estimator for Ω . The asymptotic results are very similar to the ones for the univariate case as given by Luong [

For model testing, it is viewed as testing the validity of the moment conditions, i.e., tesing the null hypothesis H 0 : E β ( g j ( x ) ) , j = 1 , ⋯ , k just as in the univariate case, the expectations are under the true parametric model.

The following test statistics given below is a chi-square test statistics which has an asymptotic chi-square distribution with r = k − p degree of freedom, i.e.,

2 n K L ( π * ( β ^ ) , p n ) = 2 n ( ∑ i = 1 n π i * ( β ^ ) [ log ( π i * ( β ^ ) ) − log ( 1 n ) ] ) → L χ 2 ( k − p ) . (13)

This test might be useful for testing bivariate normality in empirical finance and it is based on the MEEL estimators β ^ and since the basis used spans the true score functions of the bivariate normal model which makes β ^ as efficient as the maximum likelihood estimator β M L ^ , as the projected score functions are identical to the score functions. Furthermore, the asymptotic chi-square distributions are practical for the use of these test statistics and we do not need intensive simulations to implement model testing procedures as in other methods. Also, MEEL methods can be applied for model with a singular part in the domain provided that the model BLT has a closed form expression.

As for the univariare case, a penalty approach can be used which means that we can perform unconstrained minimization using the following objective function with respect to λ = ( λ 1 , ⋯ , λ k ) ′ and β 1 , ⋯ , β p ,

∑ i = 1 n π i * ( λ , β ) ln π i * ( λ , β ) + K 2 [ ( ∑ i = 1 n π i * ( λ , β ) [ g 1 ( x i , β ) ] ) 2 + ⋯ + ( ∑ i = 1 n π i * ( λ , β ) [ g k ( x i , β ) ] ) 2 ] (14)

The penalty constant K is a large positive value, direct search agorithms which are derivative free are recommended and these direct search algorithms are also more stable.

It is important to note that only a local minimizer is found each time using these algorithms, some strategies are needed to identify the global minimizer. Often, we might need a starting vector being close to the vector of the estimators to initialize the algorithm, this is important when working with real data and bivariate models might have many parameters. We might want to adopt the following strategy to look for a good starting vector. Using the two submodels based on the two marginal distributions of the model separately we can fit using univariate MEEL methods using { y i , i = 1 , ⋯ , n } and { z i , i = 1 , ⋯ , n } as described in Luong [

β = ( β 1 , β 2 , β 3 ) ′ . Now if we apply MEEL methods with the bivariate model but fixing

β 1 = β 1 ( 0 ) ^ , β 2 = β 2 ( 0 ) ^ and only estimate β 3 , the vector which estimates β 3 is denoted by β 3 ( 0 ) ^ . We can construct the following starting vector

β ^ ( 0 ) = ( β 1 ( 0 ) ^ , β 2 ( 0 ) ^ , β 3 ( 0 ) ^ ) ′ , λ ^ ( 0 ) = ( 0 , ⋯ , 0 ) and fit the bivariate model using penalty method by minimizing jointly with respect to the vector λ and the vector β the objective function as given by expression (14).

In this section we perform a limited simulation study for model given by example 1. We compare the performance of the Method of moment(MM) estimators with the MEEL estimators using the following basis B M E E L with k = 9 elements which is formed using the first 9 elements of the basis given by expression (7) as the model only has five parameters given by the vector

β = ( ф 1 , ф 2 , ф 12 , β 1 , β 2 ) ′ and these parameters will be introduced below. Using the model in example by specifying an exponential distribution for U with

density function f ( u ; β 1 ) = 1 β 1 e − u β 1 , β 1 > 0 and specifying another exponential distribution for V with density function f ( v ; β 2 ) = 1 β 2 e − u β 2 , β 2 > 0 . The random

vector N = ( N 1 , N 2 ) ′ follows a bivariate Poisson distribution with parameters ф 1 , ф 2 , ф 12 and these parameters all all positive. We have n independent and identically distributed observations

X i = ( Y i , Z i ) ′ , i = 1 , 2 , ⋯ , n . They have the same distributions as X = ( Y , Z ) ′ and using the model as specified, the following first two moments of X can be obtained, they are also given by Partrat [

E ( Y ) = ф 1 β 1 , E ( Z ) = ф 2 β 2 , V ( Y ) = 2 ф 1 β 1 2 , V ( Z ) = 2 ф 2 β 2 2 , C o v ( Y , Z ) = ф 12 β 1 β 2 (15)

where E ( . ) , V ( . ) and C o v ( . ) denote respectively the mean, the variance and covariance of the expression inside the parenthesis. Now if we replace E ( Y ) , E ( Z ) , V ( Y ) , V ( Z ) and C o v ( Y , Z ) by their sample counterparts which are given by Y , Z ¯ , s Y 2 , s Z 2 and s Y Z in expression (15) and by solving the system of equation,it will give us the MM estimators given by the vector β ˜ = ( ф 1 ˜ , ф 2 ˜ , ф 12 ˜ , β 1 ˜ , β 2 ˜ ) ′ with

β 1 ˜ = s Y 2 2 Y ¯ , β 2 ˜ = s Z 2 2 Z ¯ , ф 1 ˜ = Y ¯ β 1 ˜ , ф 2 ˜ = Z ¯ β 2 ˜ , ф 12 ˜ = s Y Z β 1 ˜ β 2 ˜

For MEEL methods we use the basis B M E E L = { g 1 , ⋯ , g 9 } and we only keep the first 9 elements of the basis B as given by expression (7).

The model BLT is

L X ( s , t ) = exp ( ф 1 ( 1 1 + β 1 s − 1 ) + ф 2 ( 1 1 + β 2 t − 1 ) + ф 12 ( 1 1 + β 1 s 1 1 + β 2 t − 1 ) ) ,

see Partrat [

We conduct a limited simulation study using this model, to simulate an observation from the bivariate Poisson distribution we use a trivariate reduction technique which is based on the following representation in distribution. We simulate first an observation R 12 from a Poisson distribution with parameter λ 12 then simulate an independent observation R 1 which follows a Poisson distribution with parameter λ 1 and another observation which is independent of the first two observations which follows a Poisson distribution with parameter λ 2 . Finally, we construct the vector of observations obtained by simulations as

N = ( N 1 , N 2 ) ′ with its components given by N 1 = d R 1 + R 12 with N 2 = d R 2 + R 12 and the equalities hold in distribution. The vector N = ( N 1 , N 2 ) ′ will follow a bivariate Poisson distribution with parameters ф 1 = λ 1 + λ 12 , ф 2 = λ 2 + λ 12 , ф 12 = λ 12 , see Johnson et al. [

We use M = 100 samples and each of the sample is of size n = 1000 . For the range of parameters we fix β 1 = β 2 and let β 1 varies from 1, 2, …, 10. For the parameters of the bivariate Poisson distribution we fix ф 12 = 0.1 , ф 1 = ф 2 and let ф 1 = 1.1 , 3.1 , 4.1 , 5.1 , 6.1 , 8.1 , 10.1 . Similarly, we fix β 1 = β 2 and let β 1 varies from 1, 2, …, 10 and we fix ф 12 = 0.2 , ф 1 = ф 2 and let ф 1 = 1.2 , 3.2 , 4.2 , 5.2 , 6.2 , 8.2 , 10.2 .

The mean square errors (MSE) are estimated using simulated samples. The mean square error of an estimator π ^ for π 0 is defined as

M S E ( π ^ ) = E ( π ^ − π 0 ) 2 . We use the following criterion for comparison between the efficiencies of MEEL estimators versus MM estimators

A R E = M S E ( ф 1 ^ ) + M S E ( ф 2 ^ ) + M S E ( ф 12 ^ ) + M S E ( β 1 ^ ) + M S E ( β 2 ^ ) M S E ( ф 1 ˜ ) + M S E ( ф 2 ˜ ) + M S E ( ф 12 ˜ ) + M S E ( β 1 ˜ ) + M S E ( β 2 ˜ )

The results are displayed in

ф 1 / β 1 | 1 | 2 | 3 | 4 | 5 | 6 | 8 | 10 |
---|---|---|---|---|---|---|---|---|

1.1 | 0.5335 | 0.4046 | 0.3099 | 0.3170 | 0.2127 | 0.1783 | 0.1514 | 0.1941 |

3.1 | 0.5033 | 0.4405 | 0.4408 | 0.4868 | 0.4508 | 0.3161 | 0.2532 | 0.2392 |

4.1 | 0.3655 | 0.4629 | 0.4102 | 0.3506 | 0.2924 | 0.3418 | 0.2921 | 0.2859 |

5.1 | 0.4405 | 0.4374 | 0.3374 | 0.3913 | 0.3246 | 0.3063 | 0.3119 | 0.2887 |

6.1 | 0.3717 | 0.3741 | 0.4499 | 0.3476 | 0.3316 | 0.3269 | 0.2876 | 0.3398 |

8.1 | 0.3920 | 0.3778 | 0.3682 | 0.3296 | 0.2768 | 0.3011 | 0.2322 | 0.2613 |

10.1 | 0.3221 | 0.3597 | 0.2950 | 0.3032 | 0.3414 | 0.2836 | 0.2186 | 0.2562 |

ф 1 / β 1 | 1 | 2 | 3 | 4 | 5 | 6 | 8 | 10 |
---|---|---|---|---|---|---|---|---|

1.2 | 0.4735 | 0.5555 | 0.4253 | 0.3800 | 0.4123 | 0.4077 | 0.4708 | 0.7260 |

3.2 | 0.4654 | 0.5085 | 0.3709 | 0.3865 | 0.4153 | 0.4300 | 0.3662 | 0.4414 |

4.2 | 0.3899 | 0.4755 | 0.3437 | 0.2910 | 0.3252 | 0.3526 | 0.3929 | 0.2642 |

5.2 | 0.3817 | 0.4329 | 0.3574 | 0.3301 | 0.2530 | 0.3645 | 0.3737 | 0.4066 |

6.2 | 0.3910 | 0.3654 | 0.4597 | 0.3235 | 0.3505 | 0.2529 | 0.3301 | 0.3078 |

8.2 | 0.5020 | 0.3809 | 0.3976 | 0.3671 | 0.3014 | 0.3129 | 0.2254 | 0.2506 |

10.2 | 0.4410 | 0.4415 | 0.3799 | 0.2931 | 0.3213 | 0.2506 | 0.2382 | 0.2700 |

we do not have large computing resources but it does confirm the asymptotic results on efficiencies of MEEL estimators. More works need to be done for assessing the efficiencies of MEEL methods by using various parametric models especially in finite samples.

MEEL estimators using the proposed moment conditions or bases appear to have the potential of higher efficiencies than MM estimators in general. The methods also provide chi-square goodness-of-fit test statistics for model testing. These features make the methods useful for inferences for bivariate distributions with closed form BLTs or BMGFs without closed form for the bivariate density functions. For these distributions, the implementation of likelihood methods might be difficult.

The helpful and constructive comments of a referee which lead to an improvement of the presentation of the paper and support form the editorial staffs of Open Journal of Statistics to process the paper are all gratefully acknowledged.

Luong, A. (2018) Maximum Entropy Empirical Likelihood Methods Based on Bivariate Laplace Transforms and Moment Generating Functions. Open Journal of Statistics, 8, 264-283. https://doi.org/10.4236/ojs.2018.82017