DSpace Collection: http://hdl.handle.net/10722/38663
Updated: 2024-03-19T11:58:51Z

Handle: http://hdl.handle.net/10722/341531
Updated: 2024-03-18T09:55:41Z
Date: 2024-01-01T00:00:00Z
Title: Exploring financial data analysis in high dimensions
Authors: Guo, Yifeng; 郭屹峰
Abstract: Financial data analysis is crucial for making informed decisions based on accurate and reliable forecasting and structural analysis. With the advancement of modern technology, researchers can collect vast amounts of financial data and pursue more complex structures within the data, leading to growing interest in high-dimensional modeling.
The first part of this thesis models realized volatilities for high-frequency data. A multilinear low-rank heterogeneous autoregressive (HAR) model is proposed by using tensor techniques, where a data-driven method is adopted to automatically select the heterogeneous components. In addition, HAR-Itô models are introduced to interpret the corresponding high-frequency dynamics, as well as those of other HAR-type models. Moreover, non-asymptotic properties of the high-dimensional HAR modeling are established, and a projected gradient descent algorithm with theoretical justifications is suggested to search for estimates. Theoretical and computational properties of the proposed method are verified by simulation experiments and real data analysis.
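For orientation, the standard univariate HAR model (not the multilinear low-rank model proposed in the thesis) regresses the next day's realized volatility on daily, weekly, and monthly averages of past realized volatilities. The symbols below and the lag windows of 1, 5, and 22 trading days are conventional choices assumed here for illustration; they are not taken from the thesis:

$$
RV_{t+1} = \beta_0 + \beta_d\, RV_t + \beta_w\, RV_t^{(w)} + \beta_m\, RV_t^{(m)} + \varepsilon_{t+1},
\qquad
RV_t^{(w)} = \frac{1}{5}\sum_{i=0}^{4} RV_{t-i},
\quad
RV_t^{(m)} = \frac{1}{22}\sum_{i=0}^{21} RV_{t-i}.
$$

In this baseline the three averaging windows are fixed in advance and act as the heterogeneous components, whereas the thesis describes selecting such components from the data.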
Secondly, this thesis proposes a sample-average approximation-based portfolio strategy to tackle the existing difficulties in portfolio management with cardinality constraints. Our strategy bypasses the estimation of the mean and covariance, which is the main obstacle in high-dimensional scenarios. Empirical results on the S&P 500 and Russell 2000 show that an appropriate number of carefully chosen assets leads to better out-of-sample mean-variance efficiency.
The third part of this thesis extends the results of the second part to risk management. Focusing on Conditional Value-at-Risk (CVaR) portfolio optimization problems in the expanding global markets, this thesis analyzes sparsity-induced portfolio strategies. The equivalence of the regularized CVaR minimization problem and its $l_0$-constrained counterpart in the norm ball is analyzed and the non-asymptotic error rate is established. The numerical experiments demonstrate the risk aversion preference and robustness of the proposed portfolio strategy on both synthetic data and the S&P 500 dataset.

Handle: http://hdl.handle.net/10722/336637
Updated: 2024-02-26T08:30:54Z
Date: 2023-01-01T00:00:00Z
Title: Random matrix theory and its applications in high-dimensional hypothesis testing problems
Authors: Mei, Tianxing; 梅天星
Abstract: Random matrix theory is an important part of modern probability theory and has been applied widely in high-dimensional statistical analysis. This thesis consists of two parts: the first characterizes the limiting singular value distribution of a large data matrix with independent columns, and the second concerns an application of random matrix theory to the problem of testing hypotheses on a growing number of large covariance matrices.
In the first part, we analyze the singular values of a large $p\times n$ data matrix $\mathbf{X}_n=(\mathbf{x}_{n1},\ldots,\mathbf{x}_{nn})$, where the columns $\{\mathbf{x}_{nj}\}$ are independent $p$-dimensional vectors, possibly with different distributions. Assuming that the covariance matrices $\mathbf{\Sigma}_{nj}=\text{Cov}(\mathbf{x}_{nj})$ of the column vectors can be asymptotically simultaneously diagonalized, with appropriately converging spectra, we establish a limiting spectral distribution (LSD) for the singular values of $\mathbf{X}_n$ when both dimensions $p$ and $n$ grow to infinity in comparable magnitudes. Our matrix model includes and goes beyond many types of sample covariance matrices in existing work, such as weighted sample covariance matrices, Gram matrices, and sample covariance matrices of a linear time series model. Furthermore, three applications of our general approach are developed.
First, we obtain the existence and uniqueness of the LSD for realized covariance matrices of a multi-dimensional diffusion process with anisotropic time-varying co-volatility. Second, we derive the LSD for singular values of data matrices from a recent matrix-valued auto-regressive model. Finally, we also obtain the LSD for singular values of data matrices from a generalized finite mixture model.
In the second part, we consider the hypothesis testing problem involving a large number $q$ of covariance matrices of dimension $p$ under a limiting scheme where $p$, $q$, and the sample sizes from the $q$ populations grow to infinity in a proper manner. Under this setting, we propose procedures for testing (a) the equality hypothesis, (b) the proportionality hypothesis, and (c) the general hypothesis on the dimension of the linear span of $q$ covariance matrices. The proposed test statistics are shown to be asymptotically normal. Simulation results show that finite sample properties of the test procedures are satisfactory under both the null and alternatives. As an application, we apply our test procedures to a matrix-valued transposable gene dataset, the Mouse Aging Project, and derive some new insights about its covariance structures. Empirical analysis of datasets from the 1000 Genomes Project (phase 3) is also conducted.

Handle: http://hdl.handle.net/10722/336616
Updated: 2024-02-26T08:30:43Z
Date: 2023-01-01T00:00:00Z
Title: Study of optimization problems in the insurance industry
Authors: He, Wanting; 贺莞婷
Abstract: In the insurance industry, optimization problems play a pivotal role in various aspects of business operations, such as risk management, pricing, and claim settlement. As competition in the market intensifies, insurance companies are increasingly turning to advanced analytical techniques and mathematical modeling approaches to optimize their strategies and maximize profitability. This thesis investigates the most significant and emerging optimization problems in the insurance industry, with a focus on cutting-edge techniques and methodologies that enhance efficiency and effectiveness in these areas. An extensive review of the literature is presented, the latest trends and innovations in optimization techniques are discussed, and novel solutions to some of the most pressing challenges faced by (re)insurance providers are proposed. Through case studies and empirical analyses, this research demonstrates the value of adopting advanced optimization methods and tools in the insurance industry, providing valuable insights for both academics and practitioners.
The first part deals with multi-constrained Pareto-optimal reinsurance problems based on general distortion risk measures, which are technically challenging and have only been solved using ad hoc methods for certain special cases. In this research, the method developed by Lo (2017) is extended by proposing a generalized Neyman-Pearson framework to identify the optimal forms of the solutions. Then a dual formulation is developed, which shows that the infinite-dimensional constrained optimization problems can be reduced to finite-dimensional unconstrained ones. With the support of the Nelder-Mead algorithm, the optimal solutions can be obtained efficiently. To illustrate the versatility of our approach, several detailed numerical examples are provided, many of which were only partially resolved in the literature.
In the second part, an evolutionary game model is developed based on a cost-benefit analysis of the insurer, the managing general agency (MGA)/broker, and the consumer in a digitalization setting. Our findings suggest that an MGA partnership could lead to higher social welfare, considering the underwriting cost and mismatching cost. The MGA partnership is particularly beneficial for small- and medium-sized insurers, and can also expand consumer demand. The evolutionarily stable state of the intermediated insurance market is determined by the relationship between the consultation fee and its critical value. Finally, the empirical data and simulation results are consistent with the predictions of our model.

Handle: http://hdl.handle.net/10722/335946
Updated: 2023-12-29T04:05:04Z
Date: 2023-01-01T00:00:00Z
Title: Strengthening cross-interaction learning for vision networks
Authors: Fang, Yanwen; 方艷雯
Abstract: In recent years, the field of computer vision has grown rapidly owing to the notable success of various vision networks such as CNNs and vision Transformers. A vision network is generally designed to learn various interactions between objects for different tasks. For example, learning the temporal interaction between different time steps is key to modeling time series data for prediction tasks. This thesis studies strengthening cross-interaction learning for vision networks in three aspects: cross-layer interaction in backbone models, intraperiod and intratrend temporal interactions in human motion, and person-person interaction in multi-person poses. To achieve these objectives, the thesis proposes three approaches, all of which enhance the representation power of the networks and deliver notable performance.
Firstly, a new cross-layer attention mechanism, called multi-head recurrent layer attention (MRLA), is proposed to strengthen layerwise interactions by retrieving query-related information from previous layers. To reduce the quadratic computation cost inherited from vanilla attention, a lightweight version of MRLA with linear complexity is further proposed to make cross-layer attention feasible for deeper networks. This thesis devises MRLA as a plug-and-play module that is compatible with two types of mainstream vision networks: CNNs and vision Transformers. Remarkable improvements brought by MRLA in image classification, object detection and instance segmentation tasks on benchmark datasets demonstrate its effectiveness, showing that MRLA can enrich the representation power of many state-of-the-art vision networks by linking the fine-grained features to the global ones.
Secondly, this thesis explores the intraperiod and intratrend interactions for human motion prediction. A new periodic-trend pose decomposition (PTPDecomp) block is proposed to decompose the hidden pose sequences into period and trend components for separately modeling the temporal dependencies within the period and trend. The PTPDecomp block cooperates with spatial GCNs and temporal GCNs, leading to an encoder-decoder framework called Periodic-Trend Enhanced GCN (PTE-GCN). The encoder or decoder progressively eliminates or refines the long-term trend pattern and focuses on modeling the period pattern, which facilitates learning the intricate temporal relationships entangled in pose sequences. Experimental results on three benchmark datasets demonstrate that PTE-GCN surpasses the state-of-the-art methods in both short-term and long-term predictions, especially for periodic actions such as walking in long-term forecasting.
Lastly, this thesis studies the interactions between the motion trajectories of highly interacting persons in the task of multi-person extreme motion prediction. A novel cross-query attention (XQA) module is proposed to bilaterally learn the cross-dependencies between the two pose sequences. Additionally, a proxy unit is introduced to bridge the involved persons, which cooperates with the XQA module and subtly controls the bidirectional information flows. These designs are then integrated into a Transformer-based architecture, and an end-to-end framework called proxy-bridged game Transformer (PGformer) is devised for multi-person motion prediction. Its effectiveness has been evaluated on the challenging ExPI dataset, and PGformer consistently outperforms the state-of-the-art methods in both short-term and long-term predictions. In addition, PGformer is compatible with the weakly interacting CMU-Mocap and MuPoTS-3D datasets and achieves encouraging results.