ISSN : 1598-7248 (Print)
ISSN : 2234-6473 (Online)
Industrial Engineering & Management Systems Vol.13 No.4 pp.442-448

Portfolio Optimization with Groupwise Selection

Namhyoung Kim*, Suvrit Sra
Department of Applied Statistics, Gachon University, Seongnam, Korea
Max Planck Institute for Intelligent Systems, Tübingen, Germany
Corresponding Author E-mail:
November 14, 2014 November 25, 2014 November 25, 2014


Portfolio optimization in the presence of estimation error can be stabilized by incorporating norm-constraints; this result was shown by DeMiguel et al. (A generalized approach to portfolio optimization: improving performance by constraining portfolio norms, Management Science, 55(5), 798-812, 2009), who reported empirical performance better than numerous competing approaches. We extend the idea of norm-constraints by introducing a powerful enhancement, grouped selection for portfolio optimization. Here, instead of merely penalizing norms of the assets being selected, we penalize groups, where within a group assets are treated alike, but across groups, the penalization may differ. The idea of groupwise selection is grounded in statistics, but to our knowledge, it is novel in the context of portfolio optimization. Novelty aside, the real benefits of groupwise selection are substantiated by experiments; our results show that groupwise asset selection leads to strategies with lower variance, higher Sharpe ratios, and even higher expected returns than the ordinary norm-constrained formulations.



    Modern portfolio theory originated with the seminal work of Markowitz (1952, 1991), who recognized that in an investment portfolio, one should choose assets not individually, but rather by considering how they are related to each other. The resulting ‘mean-variance’ portfolio selection, which essentially aims to minimize risk (defined as the variance of returns), depends strongly on good estimates of the means and covariances of the asset returns. Typically, these values are estimated by using sample means and covariances, but as strongly stressed by DeMiguel et al. (2009), estimation error in these quantities leads to poor out-of-sample performance (see also references therein). To counter this poor performance, DeMiguel et al. (2009) introduced the idea of solving the traditional minimum-variance problem subject to a norm constraint on the portfolio-weight vector. This constraint was then shown to lead to portfolio strategies that often have higher Sharpe ratios than several competing strategies (DeMiguel et al., 2009).

    Intuitively, one could attribute the better performance of their norm-constrained portfolios to selection. That is, the norm constraint restricts the choice of assets to which the investment should be allocated. However, which particular assets are chosen depends somewhat arbitrarily on the constraint parameters. So, one may naturally ask whether a more careful asset allocation can lead to even better portfolios?

    This question has been addressed at a higher level by considering the notion of asset classes (Maginn et al., 2007, Chapter 5). Therein, the authors suggest that one should divide assets into classes which satisfy the following properties: first, assets within an asset class are homogeneous; second, asset classes are mutually exclusive; third, asset classes lead to diversification; fourth, the different asset classes should cover a significant fraction of the investor’s wealth; and lastly, each asset class should have the capacity to absorb a significant fraction of the investor’s wealth.

    These properties of asset classes and the good empirical results of the norm-based model motivate us to ask:

    Can we combine asset classes and norms to obtain more effective portfolio optimization strategies?

    In this paper we provide one answer to this question by presenting several new models based on group norms, which, instead of merely constraining weights of individual assets, constrain the weight associated with a group of assets. Experiments reveal that such groupwise constraints are beneficial: they lead to better portfolio strategies (as indicated by higher Sharpe ratios, lower variance, etc.) than competing approaches. Groupwise selection is a simple yet powerful generalization of the idea of norm-constrained portfolios, especially because it permits use of asset classes, and it opens up the possibility of incorporating expert knowledge when deciding what grouping to use.

    Moreover, our mathematical formulation does not exclude overlapping groups, so if expert knowledge suggests overlapping groups, the model can accommodate it. However, as motivated in Maginn et al. (2007, Chapter 5), non-overlapping asset classes should be preferred, so we focus our attention on non-overlapping groups; we will also briefly discuss examples with overlapping groups.

    The remaining part of this paper is organized as follows. In Section 2, we review the previous portfolio optimization strategies. In Section 3, we propose our portfolio optimization strategies. In Section 4, the numerical results are presented and we conclude the paper in Section 5.


    The goal of portfolio optimization is to maximize returns while limiting risk. Single-period portfolio optimization using the mean and variance was first suggested by Markowitz (1952).

    Markowitz’ mean-variance optimization model is a widely used tool for portfolio optimization. It can be formulated in different ways. The typical problem is as follows:

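    $$
    \begin{aligned}
    \min_{w}\quad & \tfrac{1}{2}\, w^T \hat{\Sigma} w, \\
    \text{subject to}\quad & \hat{\mu}^T w \ge R, \\
    & A w = b, \\
    & C w \ge d,
    \end{aligned}
    $$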

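    in which $w \in \mathbb{R}^N$ is the vector of portfolio weights, $\hat{\Sigma} \in \mathbb{R}^{N \times N}$ is the estimated covariance matrix, $w^T \hat{\Sigma} w$ is the variance of the portfolio return, $\hat{\mu}$ is the vector of estimated asset returns, and $R$ is the required level of expected return. The pairs $(A, b)$ and $(C, d)$ collect any further linear equality and inequality constraints on the weights.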

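    An alternative is the risk-adjusted formulation:

    $$
    \begin{aligned}
    \max_{w}\quad & \hat{\mu}^T w - \tfrac{\lambda}{2}\, w^T \hat{\Sigma} w, \\
    \text{subject to}\quad & A w = b, \\
    & C w \ge d,
    \end{aligned}
    $$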

    where λ is a risk-aversion constant. Recently, some researchers have focused on the minimum-variance portfolio because of the estimation error associated with the sample mean. Like previous authors, we too focus on minimum-variance portfolios, noting, however, that our framework easily applies to other formulations, such as mean-variance.

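    In the absence of shortsale constraints, the minimum-variance portfolio is the solution to the following problem:

    $$
    \begin{aligned}
    \min_{w}\quad & \tfrac{1}{2}\, w^T \hat{\Sigma} w, && (8) \\
    \text{subject to}\quad & w^T e = 1, && (9)
    \end{aligned}
    $$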

    in which $e \in \mathbb{R}^N$ is the vector of ones.
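    As a minimal sketch, the budget-constrained minimum-variance problem above has the well-known closed-form solution $w = \hat{\Sigma}^{-1} e / (e^T \hat{\Sigma}^{-1} e)$ whenever $\hat{\Sigma}$ is positive definite. The function name `min_variance_weights` and the toy covariance matrix are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def min_variance_weights(sigma):
    """Closed-form minimum-variance weights under the budget constraint w^T e = 1.

    Assumes `sigma` is a symmetric positive-definite covariance estimate.
    """
    e = np.ones(sigma.shape[0])
    x = np.linalg.solve(sigma, e)   # x = sigma^{-1} e, without forming the inverse
    return x / (e @ x)              # rescale so that the weights sum to one

# Toy example with made-up numbers: two uncorrelated assets, variances 1 and 4.
sigma = np.array([[1.0, 0.0],
                  [0.0, 4.0]])
w = min_variance_weights(sigma)
print(w)  # the lower-variance asset receives the larger weight
```

    When the number of assets approaches or exceeds the length of the estimation window, the sample covariance matrix becomes ill-conditioned or singular, and this closed form is no longer reliable.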

    DeMiguel et al. (2009) suggested solving the traditional minimum-variance problem with the additional constraint that the norm of the portfolio-weight vector be smaller than a particular value. The p-norm-constrained portfolio is the solution to problem (8)-(9) subject to the additional constraint on the $\ell_p$-norm of the portfolio-weight vector,

    $$ \|w\|_p \le \delta, $$

    in which δ is a threshold that can be calibrated using cross-validation. They showed that the framework nests the shrinkage approaches (Jagannathan and Ma, 2003; Ledoit and Wolf, 2003, 2004) and the 1/N portfolio. The norm-constrained portfolio often showed better out-of-sample Sharpe ratios than other portfolios in the literature, although it was accompanied by higher turnover.

    We propose a more general framework than the norm-constrained framework of DeMiguel et al. (2009); all of their formulations can be obtained as special cases of our framework.
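    To illustrate how a norm constraint controls short positions, consider the case p = 1 together with the budget constraint $w^T e = 1$. The short derivation below is a standard argument, given here as a sketch; the set notation $\{i : w_i < 0\}$ is introduced only for this illustration.

```latex
\begin{aligned}
\|w\|_1 \;=\; \sum_{i:\,w_i \ge 0} w_i \;+\; \sum_{i:\,w_i < 0} |w_i|,
\qquad
1 \;=\; w^T e \;=\; \sum_{i:\,w_i \ge 0} w_i \;-\; \sum_{i:\,w_i < 0} |w_i| \\[4pt]
\Longrightarrow\quad
\|w\|_1 \;=\; 1 + 2 \sum_{i:\,w_i < 0} |w_i|
\quad\Longrightarrow\quad
\|w\|_1 \le \delta
\;\iff\;
\sum_{i:\,w_i < 0} |w_i| \;\le\; \frac{\delta - 1}{2}.
\end{aligned}
```

    In particular, δ = 1 forces every weight to be nonnegative, which recovers the shortsale-constrained minimum-variance portfolio, while δ > 1 caps the total short position at (δ − 1)/2.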


    As mentioned above, Markowitz’ mean-variance optimization problem can be formulated with additional constraints, such as a shortsale constraint (Jagannathan and Ma, 2003) or a norm constraint (DeMiguel et al., 2009). One can also reduce sector risk by adding the constraint

    $$ \sum_{i \,\in\, \text{sector } k} w_i \le m_k, $$

    where $m_k$ is the maximum that can be invested in sector k. However, adding more constraints can only worsen the optimal objective value (Cornuejols and Tutuncu, 2007). In this paper, we suggest new portfolio optimization strategies using group-norms.
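    The sector constraint can be checked for a candidate weight vector with a few lines of code. The following is only a sketch: the helper names `sector_exposure` and `violated_sectors`, the sector labels, and the caps are hypothetical and chosen for illustration.

```python
def sector_exposure(w, sectors):
    """Total portfolio weight per sector; sectors[i] is the label of asset i."""
    totals = {}
    for weight, k in zip(w, sectors):
        totals[k] = totals.get(k, 0.0) + weight
    return totals

def violated_sectors(w, sectors, caps, tol=1e-12):
    """Sectors whose total weight exceeds the cap m_k given in `caps`."""
    exposure = sector_exposure(w, sectors)
    return sorted(k for k, total in exposure.items() if total > caps[k] + tol)

# Hypothetical four-asset portfolio with two sectors.
w = [0.4, 0.3, 0.2, 0.1]
sectors = ["tech", "tech", "energy", "energy"]
caps = {"tech": 0.6, "energy": 0.5}   # m_k for each sector

print(sector_exposure(w, sectors))
print(violated_sectors(w, sectors, caps))
```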

    A mixed-norm (also called a group-norm) is usually defined over a set of parameter vectors, where the individual parameters form ‘groups’ and each group’s penalty or contribution may be measured using a different $\ell_p$-norm. This notion is described formally in Definitions 1-4 below.


    DEFINITION 1 (Mixed-norm: Vectors).

    Let $w \in \mathbb{R}^d$ be partitioned into the set $\{w_t : w_t \in \mathbb{R}^{d_t},\ 1 \le t \le n\}$ of (column) vectors. We define the mixed $\ell_{p,q}$-norm (read as the p-norm of q-norms), $1 \le p, q \le \infty$, for $w$ as

    $$ \|w\|_{p,q} = \bigl\| \bigl[\, \|w_1\|_q;\ \|w_2\|_q;\ \ldots;\ \|w_n\|_q \,\bigr] \bigr\|_p . \tag{11} $$

    That is, we compute the $\ell_p$-norm of the vector of length n formed by computing the $\ell_q$-norms of the individual vectors $w_t$ ($1 \le t \le n$). (Note: definition (11) can be further generalized, e.g., by taking a separate $\ell_{q_t}$-norm $\|w_t\|_{q_t}$ for each subvector. This generalization comes up, for example, while studying Lp-nested symmetric distributions (Bethge et al., 2009).)
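    Definition 1 translates almost directly into code. The sketch below assumes the partition is given as a list of index lists; the function name `mixed_norm` and the example vector are illustrative and not part of the paper.

```python
import numpy as np

def mixed_norm(w, groups, p, q):
    """Mixed l_{p,q} norm: the l_p norm of the vector of per-group l_q norms.

    `groups` is a list of index lists, one per subvector w_t;
    p and q may be any value >= 1, or np.inf.
    """
    w = np.asarray(w, dtype=float)
    inner = np.array([np.linalg.norm(w[idx], ord=q) for idx in groups])
    return float(np.linalg.norm(inner, ord=p))

w = np.array([3.0, 4.0, 0.0, -1.0])
groups = [[0, 1], [2, 3]]

print(mixed_norm(w, groups, p=1, q=2))        # 5 + 1 = 6.0
print(mixed_norm(w, groups, p=1, q=np.inf))   # 4 + 1 = 5.0
```

    With singleton groups and p = q, the mixed norm reduces to the ordinary $\ell_p$-norm of w, which is how the norm-constrained formulation appears as a special case.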

    DEFINITION 2 (Mixed-norm: Matrices).

    For a matrix $W \in \mathbb{R}^{d \times n}$ with rows $w^1, \ldots, w^d$, we define the mixed-norm to be the $\ell_p$-norm of the $\ell_q$-norms of the rows; i.e.,

    $$ \|W\|_{p,q} = \bigl\| \bigl[\, \|w^1\|_q;\ \|w^2\|_q;\ \ldots;\ \|w^d\|_q \,\bigr] \bigr\|_p . $$

    We define the mixed-norm over rows rather than columns, because usually in multi-task setups one wishes to enforce sparsity across the same feature for multiple tasks, whereby the same ‘row’ across all tasks (columns of W) is penalized. For example, the $\ell_{1,\infty}$-norm of $W \in \mathbb{R}^{d \times n}$ is

    $$ \|W\|_{1,\infty} = \sum_{i=1}^{d} \|w^i\|_\infty . $$
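    For matrices, the row-wise mixed norm of Definition 2 can be sketched with NumPy's axis-wise vector norms. The function name `matrix_mixed_norm` and the example matrix are assumptions made for illustration.

```python
import numpy as np

def matrix_mixed_norm(W, p, q):
    """l_p norm of the vector of row-wise l_q norms of W."""
    W = np.asarray(W, dtype=float)
    row_norms = np.linalg.norm(W, ord=q, axis=1)   # one l_q norm per row
    return float(np.linalg.norm(row_norms, ord=p))

# Rows are features, columns are tasks; the all-zero row contributes nothing.
W = np.array([[1.0, -2.0],
              [0.0,  0.0],
              [3.0,  1.0]])

print(matrix_mixed_norm(W, p=1, q=np.inf))   # 2 + 0 + 3 = 5.0
```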

    DEFINITION 3 (Group-norm).

    Let w be as in Definition 1. We define the group-norm, for $1 \le p \le \infty$, as

    $$ \|w\|_{\mathrm{Gr}_p} = \bigl\| \bigl[\, \|w_1\|_{K_1};\ \|w_2\|_{K_2};\ \ldots;\ \|w_n\|_{K_n} \,\bigr] \bigr\|_p , \tag{14} $$

    i.e., the $\ell_p$-norm of a vector formed by taking Hilbert-Schmidt norms parameterized by the positive-definite matrices $K_t \in S_{++}^{d_t}$ ($1 \le t \le n$). For example, if $K_t = I_{d_t}$ and $p = 1$, then (14) becomes

    $$ \|w\|_{\mathrm{Gr}_1} = \|w\|_{1,2} = \sum_{t=1}^{n} \|w_t\|_2 . $$
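    The group-norm of Definition 3 can be sketched as follows. The paper does not spell out the K-parameterized norm, so this sketch assumes $\|w_t\|_{K_t} = \sqrt{w_t^T K_t w_t}$, which is consistent with the identity-matrix example above; the function name `group_norm` is likewise an illustrative choice.

```python
import numpy as np

def group_norm(w, groups, Ks, p=1):
    """l_p norm of the per-group K-norms.

    ASSUMPTION: ||w_t||_K = sqrt(w_t^T K w_t) for a positive-definite K;
    with K = I this is the Euclidean norm of the subvector.
    """
    w = np.asarray(w, dtype=float)
    inner = []
    for idx, K in zip(groups, Ks):
        wt = w[idx]
        inner.append(np.sqrt(wt @ K @ wt))
    return float(np.linalg.norm(np.array(inner), ord=p))

w = np.array([3.0, 4.0, 0.0, -1.0])
groups = [[0, 1], [2, 3]]

# With identity matrices and p = 1 the group-norm equals the l_{1,2} norm.
print(group_norm(w, groups, [np.eye(2), np.eye(2)], p=1))   # 5 + 1 = 6.0
```

    A non-identity $K_t$ reweights assets within a group, for instance to penalize some assets in an asset class more heavily than others.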

    DEFINITION 4 (Mixed quasi-norms).

    If, in Definitions 1 or 2, we permit any row or subvector (as the case may be) to have its norm measured by the $\ell_0$-quasi-norm, we obtain a mixed quasi-norm. The most important instances are $\|W\|_{0,0}$, $\|W\|_{0,p}$, and $\|W\|_{p,0}$, where $1 \le p \le \infty$.

    In this paper, we propose new portfolio optimization strategies using group-norms. The formulation is as follows:

    $$
    \begin{aligned}
    \min_{w}\quad & \tfrac{1}{2}\, w^T \hat{\Sigma} w, \\
    \text{subject to}\quad & w^T e = 1, \\
    & \|w\|_{p,q} \le \delta,
    \end{aligned}
    $$

    where $\|w\|_{p,q}$ is a group-norm. To apply group-norms, we must partition the stocks into several groups. We use random grouping and the k-means clustering algorithm. For k-means clustering, the sample return and variance of each asset are used as attributes. We also allow overlapping groups by using a soft k-means algorithm. Each group is assumed to represent an asset class.
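    A minimal sketch of the k-means grouping step is given below, using each asset's sample mean and variance as its two attributes. This is plain Lloyd's algorithm; the deterministic farthest-point initialization, the function names, and the toy feature values are assumptions made for illustration, and the paper's own implementation details are not stated.

```python
import numpy as np

def asset_features(returns):
    """Per-asset sample mean and variance from a (T, N) matrix of returns."""
    returns = np.asarray(returns, dtype=float)
    return np.column_stack([returns.mean(axis=0), returns.var(axis=0, ddof=1)])

def kmeans_groups(features, k, n_iter=100):
    """Plain Lloyd's k-means; returns k non-overlapping lists of asset indices."""
    X = np.asarray(features, dtype=float)
    # Deterministic farthest-point initialization (an illustrative choice).
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        # Assign every asset to its nearest center (squared Euclidean distance).
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Recompute centers; an empty cluster keeps its previous center.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return [np.flatnonzero(labels == j).tolist() for j in range(k)]

# Toy features (made up): three low-mean/low-variance and three high/high assets.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]])
print(kmeans_groups(X, k=2))
```

    Because sample means and variances can live on very different scales, one may want to standardize the two attributes before clustering; whether the original experiments did so is not stated.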

    3.2. Interpretation of the Group-Norms

    The additional group-norm constraint can be interpreted in several ways. The first is that it introduces groupwise selection into portfolio optimization, an idea that goes further than it may first appear. The second is that it explains the modeling power of groupwise selection. For example, in the Black-Litterman model, one of the first suggested steps is to divide the assets into various ‘asset classes’, and Maginn et al. (2007) suggest that these asset classes should satisfy:

    • Assets within a class should be homogeneous (the same $\ell_q$-norm is used for all elements within a group);

    • Asset classes should be mutually exclusive (non-overlapping variables for groups);

    • Asset classes should be diversifying (e.g., use of the $\ell_{1,\infty}$-norm promotes diversity, while being homogeneous within a group, that is, within an asset class);

    • Each asset class should have the capacity to absorb a significant fraction of the investor’s wealth (again the $\ell_{1,\infty}$-norm makes sense, because the ∞ part distributes wealth across the assets within the class; since the $\ell_1$ part favors sparsity, one could use other norms, such as $\ell_{2,\infty}$, to absorb a greater fraction of wealth);

    • Asset classes taken together should make up a significant portion of the investor’s wealth (if we cover all the possible assets this requirement is addressed automatically; by using the $\ell_1$-norm, we promote sparsity, so that overly diffuse investments are not made, but rather a few ‘groups’ or ‘asset classes’ get selected, and the investments are made in them).
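    The effect of the ∞ part of the $\ell_{1,\infty}$-norm described in the list above can be seen in a small numerical sketch. The two weight vectors and the helper name `l1_inf_norm` are made up for illustration; both portfolios invest half of the wealth in each of two asset classes.

```python
def l1_inf_norm(w, groups):
    """l_{1,inf} mixed norm: sum over groups of the largest absolute weight."""
    return sum(max(abs(w[i]) for i in idx) for idx in groups)

groups = [[0, 1, 2], [3, 4, 5]]                  # two asset classes of three assets
concentrated = [0.5, 0.0, 0.0, 0.5, 0.0, 0.0]    # one asset holds each class's wealth
spread = [1 / 6] * 6                             # wealth spread evenly within each class

print(l1_inf_norm(concentrated, groups))   # 0.5 + 0.5 = 1.0
print(l1_inf_norm(spread, groups))         # about 1/3
```

    Under a constraint $\|w\|_{1,\infty} \le \delta$ with δ below 1, the concentrated portfolio is infeasible while the evenly spread one remains feasible, which is the sense in which the ∞ part pushes wealth to be distributed within a class.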

    Our model of groupwise selection, however, goes beyond these prescriptions of Maginn et al. (2007), and permits arbitrary overlapping groups. That is, the same asset may be part of more than one asset class; even though mathematically we permit this, we do not have a natural interpretation for such grouping, except if one interprets the ‘asset classes’ as ‘hedging’ the groups themselves, so that an erroneous asset classification does not have too severe a negative impact on the investment. The last interpretation concerns the mathematical formulation itself, together with algorithms for efficiently solving the associated optimization problems.


    For the application, we give an example of the results. We used weekly historical data to estimate the means and covariances of asset returns. The data was taken from Yahoo! Finance. The portfolio is composed of S&P 500 component shares. The S&P 500 is widely regarded as one of the best representations of the US stock market and a leading indicator of business cycles. It consists of common stocks listed on the NYSE or NASDAQ. The index has 500 constituents, of which we use 466. The sample covered the period from August 16, 2004 through August 2, 2010. We estimate the sample mean and variance every Monday. An overview of the data is given in Table 1.

    To evaluate the performance of the proposed method with N available assets, we compute the out-of-sample variance, Sharpe ratio, and turnover as follows:

    $$
    \begin{aligned}
    \hat{\sigma}_i^2 &= \frac{1}{T - \tau - 1} \sum_{t=\tau}^{T-1} \bigl( w_t^{i\,T} r_{t+1} - \hat{\mu}_i \bigr)^2, \\
    \text{with}\quad \hat{\mu}_i &= \frac{1}{T - \tau} \sum_{t=\tau}^{T-1} w_t^{i\,T} r_{t+1}, \\
    \widehat{SR}_i &= \frac{\hat{\mu}_i}{\hat{\sigma}_i}, \\
    \text{Turnover} &= \frac{1}{T - \tau - 1} \sum_{t=\tau}^{T-1} \sum_{j=1}^{N} \bigl| w_{j,t+1}^{i} - w_{j,t^+}^{i} \bigr|,
    \end{aligned}
    $$

    where τ is the length of the estimation window, T is the total number of returns in the dataset, $w_t^i$ is the portfolio-weight vector of strategy i at time t, and $r_{t+1}$ denotes the asset returns. In the definition of turnover, $w_{j,t}^i$ is the portfolio weight in asset j at time t for strategy i, $w_{j,t^+}^i$ is the portfolio weight before rebalancing at time t + 1, and $w_{j,t+1}^i$ is the weight after rebalancing.
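    The performance measures above can be sketched in code as follows. Two points are assumptions of this sketch rather than statements from the paper: the pre-rebalancing weights $w_{t^+}$ are obtained by letting each holding grow with its own return and renormalizing, and the turnover sum uses only the consecutive pairs of portfolios that are actually available. The function name `out_of_sample_metrics` is illustrative.

```python
import numpy as np

def out_of_sample_metrics(weights, next_returns):
    """Out-of-sample mean, variance, Sharpe ratio, and turnover of one strategy.

    weights[m] is the portfolio chosen at time t and next_returns[m] is r_{t+1},
    for the M = T - tau out-of-sample periods; both arrays have shape (M, N).
    """
    W = np.asarray(weights, dtype=float)
    R = np.asarray(next_returns, dtype=float)
    M = W.shape[0]
    port = (W * R).sum(axis=1)                    # realized returns w_t^T r_{t+1}
    mean = port.mean()                            # divides by T - tau
    var = ((port - mean) ** 2).sum() / (M - 1)    # divides by T - tau - 1
    sharpe = mean / np.sqrt(var)
    # ASSUMPTION: weights drift with realized returns before rebalancing.
    drifted = W * (1.0 + R) / (1.0 + port)[:, None]
    # Compare each new portfolio with the drifted previous one (M - 1 pairs).
    turnover = np.abs(W[1:] - drifted[:-1]).sum() / (M - 1)
    return {"mean": float(mean), "variance": float(var),
            "sharpe": float(sharpe), "turnover": float(turnover)}

# Toy check: an equally weighted portfolio of two assets with identical returns
# never drifts, so its turnover is zero.
W = np.full((3, 2), 0.5)
R = np.array([[0.1, 0.1], [0.0, 0.0], [0.2, 0.2]])
print(out_of_sample_metrics(W, R))
```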

    We use the rolling-window procedure for the comparison. For our empirical test, we use an estimation window of τ = 120, which corresponds to thirty months for weekly data. We compare empirically the out-of-sample performance of group-norms portfolios to other strategies. The portfolios we evaluate are listed in Table 2.
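    The rolling-window procedure can be sketched as a loop that re-estimates the portfolio on the τ most recent returns and records the return realized in the following period. The function names and the use of the 1/N rule as the example strategy are illustrative assumptions; any function mapping an estimation window to a weight vector can be plugged in.

```python
import numpy as np

def rolling_window_weights(returns, tau, strategy):
    """Apply `strategy` on a rolling window of length tau.

    `returns` has shape (T, N). For each t = tau, ..., T - 1 the strategy sees
    only the tau previous rows and is evaluated on the next, unseen row.
    """
    R = np.asarray(returns, dtype=float)
    T = R.shape[0]
    weights, next_returns = [], []
    for t in range(tau, T):
        window = R[t - tau:t]             # the tau most recent observations
        weights.append(strategy(window))
        next_returns.append(R[t])         # out-of-sample return for this portfolio
    return np.array(weights), np.array(next_returns)

def equal_weight(window):
    """The 1/N benchmark: ignore the data and weight all assets equally."""
    n = window.shape[1]
    return np.full(n, 1.0 / n)

# Made-up return matrix with T = 10 periods and N = 3 assets.
R = np.arange(30, dtype=float).reshape(10, 3) / 100.0
W, nxt = rolling_window_weights(R, tau=4, strategy=equal_weight)
print(W.shape)   # (6, 3): one portfolio for each of the T - tau periods
```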

    MINU, NC1V, and NC2V are models for the minimum-variance portfolio. MEAN, NC1M, and NC2M are models for the mean-variance portfolio.

    To build the proposed portfolios, we must select the number of groups, the value of the threshold δ, and the orders p and q of the group-norm. We form two, five, and ten groups and use group-norms with p = 1 and q = 2, 3, ∞. The threshold was calibrated with cross-validation (Efron and Gong, 1983; Campbell et al., 1997, Section 12.3.2).

    Tables 3 and 4 show the out-of-sample performance of each strategy. Table 3 shows the performance of minimum-variance portfolios, for which the objective function is $\min_w \tfrac{1}{2} w^T \hat{\Sigma} w$. The results for mean-variance portfolios are presented in Table 4. Because the estimation error associated with the sample mean is relatively large, the minimum-variance portfolio often performs better than the mean-variance portfolio, as extensive empirical evidence shows.

    This table reports the weekly out-of-sample mean, Sharpe ratio, variance and turnover for the minimum-variance portfolio, $\min_w \tfrac{1}{2} w^T \hat{\Sigma} w$.

    This table reports the weekly out-of-sample mean, Sharpe ratio, variance and turnover for the mean-variance portfolio, $\max_w \hat{\mu}^T w - \tfrac{\lambda}{2} w^T \hat{\Sigma} w$.

    From Tables 3 and 4 we see that the portfolios with non-overlapping groups are usually better than the portfolios with overlapping groups. Compared with the benchmark portfolios listed in Table 2, the portfolios developed in this paper show reasonable out-of-sample performance. If the number of groups and the orders p and q are chosen optimally, the performance of the proposed method will be better than that of the norm-constrained portfolios.


    We provided a general unifying framework for portfolio optimization. This paper contributes to the literature on portfolio optimization strategies in three ways. First, our group-norm constraint model is a general form that includes other constrained portfolio strategies, such as the shortsale-constrained model (Lintner, 1965; Jagannathan and Ma, 2003), the norm-constrained model (DeMiguel et al., 2009), and the simple 1/N model. Second, the portfolios developed in this paper often show better mean return and Sharpe ratio than the existing portfolios, although the higher Sharpe ratio is accompanied by higher turnover. Lastly, the proposed model naturally permits the use of asset classes, which lead to diversification of the portfolio. This property of asset classes, together with the good empirical results of the norm-constrained model, leads to the better performance.

    For further research, the strategies for grouping assets can be analyzed. In this research, we use simple random grouping and the k-means clustering algorithm based on sample mean and variance, without considering expert knowledge or the industrial properties of assets. In particular, as mentioned earlier, means are more difficult to estimate, so skill in forecasting expected returns is needed for better performance. The optimal number of groups should also be studied.



    Table 1. Data set description

    Table 2. List of benchmark portfolios

    Table 3. Portfolio performance comparison (minimum-variance portfolios)

    Table 4. Portfolio performance comparison (mean-variance portfolios)


    1. Bethge, M., Simoncelli, E. P., and Sinz, F. H. (2009), Hierarchical modeling of local image features through Lp-nested symmetric distributions, pp. 1696-1704.
    2. Campbell, J. Y., Lo, A. W., and MacKinlay, A. C. (1997), The Econometrics of Financial Markets, Princeton University Press.
    3. Cornuejols, G. and Tutuncu, R. (2007), Optimization Methods in Finance, Cambridge University Press.
    4. DeMiguel, V., Garlappi, L., Nogales, F. J., and Uppal, R. (2009), A generalized approach to portfolio optimization: improving performance by constraining portfolio norms, Management Science, Vol. 55 (5), pp. 798-812.
    5. Efron, B. and Gong, G. (1983), A leisurely look at the bootstrap, the jackknife, and cross-validation, The American Statistician, Vol. 37 (1), pp. 36-48.
    6. Jagannathan, R. and Ma, T. (2003), Risk reduction in large portfolios: why imposing the wrong constraints helps, The Journal of Finance, Vol. 58 (4), pp. 1651-1684.
    7. Ledoit, O. and Wolf, M. (2003), Improved estimation of the covariance matrix of stock returns with an application to portfolio selection, Journal of Empirical Finance, Vol. 10 (5), pp. 603-621.
    8. Ledoit, O. and Wolf, M. (2004), A well-conditioned estimator for large-dimensional covariance matrices, Journal of Multivariate Analysis, Vol. 88 (2), pp. 365-411.
    9. Lintner, J. (1965), The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets, The Review of Economics and Statistics, Vol. 47 (1), pp. 13-37.
    10. Maginn, J. L., Tuttle, D. L., McLeavey, D. W., and Pinto, J. E. (2007), Managing Investment Portfolios: A Dynamic Process, Wiley.
    11. Markowitz, H. (1952), Portfolio selection, Journal of Finance, Vol. 7 (1), pp. 77-91.
    12. Markowitz, H. (1991), Portfolio Selection: Efficient Diversification of Investments, Blackwell.