DSpace Collection:

DSpace Collection: http://hdl.handle.net/10722/38663 2024-09-25T18:54:59Z 2024-09-25T18:54:59Z Generalization analysis and regularization in over-parameterized models Meng, Xuran 孟徐然 http://hdl.handle.net/10722/345401 2024-08-26T08:59:32Z 2024-01-01T00:00:00Z

Title: Generalization analysis and regularization in over-parameterized models Authors: Meng, Xuran; 孟徐然 Abstract: We study the success of over-parameterized models in both regression and classification tasks. In the regression task, we uncover the phenomenon of multiple descent in random feature models, where the test accuracy follows a curve with multiple descents as the number of model parameters increases. In the classification task, we theoretically establish the capability of two-layer ReLU convolutional neural networks to learn complex XOR data. We find that these networks can achieve the Bayes optimal test accuracy when the data signal-to-noise ratio (SNR) is high. Through our theoretical investigations, we discover that benign overfitting only occurs when the data set has a high SNR. Models trained on low-SNR data consistently exhibit poor test performance, indicating harmful overfitting of the training data set. We also explore two regularization techniques that can address harmful overfitting on low-SNR data sets for over-parameterized models. First, we investigate gradient regularization and its role during the training process. Our theoretical analysis reveals that gradient regularization can effectively suppress the memorization of noise within the model. Consequently, models with gradient regularization exhibit improved performance in signal learning compared to models without it. Second, we explore the use of early stopping as a regularization technique. By observing the spectra of weight matrices during the training procedure, researchers identify deviations from the Marchenko-Pastur law. We find that these deviations indicate the presence of sufficient training information or potential issues. As a result, we propose a spectral criterion that can guide the early stopping process during training.
Overall, this thesis highlights our investigations into the success of over-parameterized models in various learning tasks. We provide insights into the conditions under which these models perform well, and investigate several regularization techniques that can mitigate harmful overfitting.
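The abstract above refers to deviations of weight-matrix spectra from the Marchenko-Pastur law. As a rough illustration only, the sketch below computes the Marchenko-Pastur bulk edges for a matrix with i.i.d. entries and lists eigenvalues that fall outside them. It is not the thesis's spectral criterion, which the abstract does not specify; the function names, the `slack` parameter and the normalization (eigenvalues of `(1/n_rows) * W^T W` for an `n_rows x n_cols` matrix `W` with `n_cols <= n_rows`) are assumptions made for this example.

```python
import math


def marchenko_pastur_edges(n_rows, n_cols, variance=1.0):
    """Bulk edges of the Marchenko-Pastur law for (1/n_rows) * W^T W.

    Assumes W is n_rows x n_cols with i.i.d. zero-mean entries of the given
    variance and n_cols <= n_rows (hypothetical setup for illustration).
    """
    root = math.sqrt(n_cols / n_rows)
    return variance * (1.0 - root) ** 2, variance * (1.0 + root) ** 2


def eigenvalues_outside_bulk(eigenvalues, n_rows, n_cols, variance=1.0, slack=0.0):
    """Return the eigenvalues lying outside the bulk, widened by `slack`.

    `slack` is a hypothetical allowance for finite-size fluctuations
    around the asymptotic edges.
    """
    lower, upper = marchenko_pastur_edges(n_rows, n_cols, variance)
    return [ev for ev in eigenvalues if ev < lower - slack or ev > upper + slack]


if __name__ == "__main__":
    # Eigenvalues here are made-up numbers, not results from the thesis.
    print(marchenko_pastur_edges(400, 100))
    print(eigenvalues_outside_bulk([0.3, 1.0, 2.2, 5.0], 400, 100))
```

Eigenvalues well above the upper edge are commonly read as structure that pure noise would not produce; how such outliers translate into a stopping rule is left to the thesis itself.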

2024-01-01T00:00:00Z Laguerre path-dependent volatility model Chiu, Eddie W. K 趙偉棋 http://hdl.handle.net/10722/344399 2024-07-30T05:00:37Z 2023-01-01T00:00:00Z

Title: Laguerre path-dependent volatility model Authors: Chiu, Eddie W. K; 趙偉棋 Abstract: There has been much effort devoted to designing and engineering new and improved stochastic models for option pricing. Much attention in both academia and industry has been drawn to stochastic models that model the asset dynamics without regard to the historical path of the asset price leading up to the time when option pricing is performed. A path-dependent volatility model is a stochastic model where the volatility dynamics are driven by the whole path of the asset price. The use of a path-dependent volatility model allows for the incorporation of the historical path of the asset price in modeling the volatility dynamics. It is perhaps intuitive that the price of an asset is driven by market factors which may not be adequately captured by financial variables measured at one instant; instead, there may be information regarding such factors that can be extracted from the historical path of the asset price. For instance, an upward sloping path in recent history may signal a positive outlook, or even a market boom; a downward sloping path may signify market distress; a trough or a crest in recent times may signify a reversion of market conditions, to name a few. The apparent or hidden patterns in the historical path of asset prices may hint at how the market will continue to evolve. Therefore, the path-dependent volatility model is a natural extension of the prevailing stochastic models for option pricing. The mainstream path-dependent volatility models take the approach of inventing path-dependent state variables that encode the path-dependent information of the historical path of asset prices and inventing a volatility function that extracts the path-dependent information from the path-dependent state variables; often the inventions are based on intuition or ad hoc analysis.
We propose an innovative formulation of a path-dependent volatility model called the Laguerre Path-Dependent Volatility (LPDV) model. We apply series expansion to a historical price path with Laguerre polynomials, turning a path into a sequence of coefficients of the series. This sequence can be interpreted as a representation of the path, and we select a finite subset of the sequence as the path-dependent state variables with the property that they approximately represent the path. Then, we choose a volatility function that is both sufficiently flexible and theoretically connected to the Laguerre series expansion. The theoretical analysis is supported by a sound theoretical framework that we develop. We also provide a detailed account of model calibration. We discuss comprehensively the considerations and challenges that one might face in model calibration. In addition, we propose an innovative calibration procedure that is uncommon in the literature but is suitable for the LPDV model. Finally, we conduct a numerical experiment where we test the LPDV model in a simulated, controlled setting. We discuss the details in various aspects of the implementation of both model calibration and option pricing. We provide example cases to study the performance of the model, paying attention to how path-dependent volatility comes into play.
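The idea of turning a path into a finite set of Laguerre coefficients can be sketched generically. The example below expands a function f on [0, ∞) as f(x) ≈ Σ c_k L_k(x), with c_k = ∫ f(x) L_k(x) e^{-x} dx, using the standard three-term recurrence and a plain trapezoid rule. How the LPDV model maps a historical price path onto this domain, and which weighting it uses, is not stated in the abstract; the function names, the truncation point `upper` and the step size are assumptions for illustration.

```python
import math


def laguerre_values(n_terms, x):
    """Return [L_0(x), ..., L_{n_terms-1}(x)] via the three-term recurrence."""
    values = [1.0]
    if n_terms > 1:
        values.append(1.0 - x)
    for k in range(1, n_terms - 1):
        # (k + 1) L_{k+1}(x) = (2k + 1 - x) L_k(x) - k L_{k-1}(x)
        values.append(((2 * k + 1 - x) * values[k] - k * values[k - 1]) / (k + 1))
    return values[:n_terms]


def laguerre_coefficients(f, n_terms, upper=40.0, step=0.01):
    """Approximate c_k = int_0^inf f(x) L_k(x) exp(-x) dx on [0, upper].

    Uses the trapezoid rule; `upper` and `step` are illustrative choices.
    """
    n_steps = int(round(upper / step))
    coeffs = [0.0] * n_terms
    for i in range(n_steps + 1):
        x = i * step
        weight = 0.5 if i in (0, n_steps) else 1.0
        base = weight * step * f(x) * math.exp(-x)
        for k, lk in enumerate(laguerre_values(n_terms, x)):
            coeffs[k] += base * lk
    return coeffs


def reconstruct(coeffs, x):
    """Evaluate the truncated series sum_k c_k L_k(x)."""
    return sum(c * lk for c, lk in zip(coeffs, laguerre_values(len(coeffs), x)))


if __name__ == "__main__":
    # For f(x) = exp(-x), the exact coefficients are c_k = (1/2)^(k+1).
    print(laguerre_coefficients(lambda x: math.exp(-x), 4))
```

Keeping only the first few coefficients gives a low-dimensional summary of the function, which is the role the abstract assigns to the path-dependent state variables.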

2023-01-01T00:00:00Z Dimension reduction via projection and its applications Si, Yuefeng 斯越峰 http://hdl.handle.net/10722/343765 2024-06-06T01:04:49Z 2024-01-01T00:00:00Z

Title: Dimension reduction via projection and its applications Authors: Si, Yuefeng; 斯越峰 Abstract: In the era of big data, high-dimensional data sets are widely encountered in the fields of genomics, time series and machine learning. Projection techniques from dimension reduction are efficient for compressing the number of features and conducting statistical interpretation. With the advancement of modern technology, parametric tensor decomposition and nonparametric angle-based distance have become popular projection tools. In the first part of the thesis, a newly proposed tensor train (TT) decomposition is used to compress the parametric subspace of tensor regression. Many existing models for high-dimensional data are based on Tucker decomposition, which has good properties but loses its efficiency in compressing tensors very quickly as the order of tensors increases, say beyond four or five. We propose a modified TT decomposition and then apply it to tensor regression such that a nice statistical interpretation can be obtained. The new tensor regression matches data with hierarchical structures well. More importantly, the new tensor regression can be easily applied to the case of higher-order tensors, since TT decomposition can compress the coefficient tensors much more efficiently. The methodology is also extended to tensor autoregression for time series data, and nonasymptotic properties are derived for ordinary least squares estimation of both tensor regression and autoregression. A new algorithm is introduced to search for estimators, and its theoretical justification is also discussed. Theoretical and computational properties of the proposed methodology are verified by simulation studies, and the advantages over existing methods are illustrated by two real examples. Second, a novel projection mean variance (PMV) measure from a nonparametric model is used to test the multi-sample hypothesis of equal distributions for univariate or multivariate responses.
The proposed PMV measure generalizes the mean variance index using a projection technique. The PMV measure yields an analogous variance component decomposition. Using this decomposition, an ANOVA F statistic is derived to test the multi-sample problem. The proposed test is statistically consistent against general alternatives and robust to heavy-tailed data. The test is free of tuning parameters and does not require moment conditions on the response. Our simulation results demonstrate that the PMV test has higher power than classical Wilks-type methods and the DISCO test, especially when the dimension of the response is relatively large or the moment conditions required by the DISCO test are violated. We further illustrate our method using empirical analyses of two real data sets. Lastly, projection quantile correlation (PQC) is proposed to detect quantile dependence between a response and multivariate predictors at a given quantile level. We then use the measure to select grouped predictors that contribute to the conditional quantile of the response for high-dimensional data with group structures. The sure independence screening property is established for the group screening method. We illustrate the finite-sample performance of the proposed method through simulations and an application to a data set.

2024-01-01T00:00:00Z Study of survival models with infinite parameter space and its application in network analysis Zhou, Yunpeng 周云鵬 http://hdl.handle.net/10722/342895 2024-05-07T01:22:15Z 2023-01-01T00:00:00Z

Title: Study of survival models with infinite parameter space and its application in network analysis Authors: Zhou, Yunpeng; 周云鵬 Abstract: High-dimensional data is commonly observed in survival analysis, which requires the use of survival models with infinite parameter space. For example, genomic data such as DNA micro-array data is frequently used for risk prediction, but the number of parameters p is always larger than the number of observations n. In datasets with a large number of covariates, the parameter space always exhibits sparsity or homogeneity. Therefore, it is crucial to develop methods for estimating the coefficients accurately and identifying the significant parameters affecting the hazard rate. Penalized regression such as LASSO is a usual choice for variable selection in real applications. In order to improve the estimation accuracy, two algorithms are proposed in this thesis to approximate the solution to the l0 penalized regression. Both methods perform well in selecting the subset of parameters, especially in terms of controlling the false positive rate. In addition, since the hazard rate of a survival model describes the frequency of event occurrence, it is natural to extend its application to the area of network analysis for describing the communication frequency between individuals. Recurrent network event data is most relevant for studying phenomena that involve repeated interactions between subjects over time, such as communication networks or social networks. The analysis of such data is hence more complex than that of static network data, as one needs to analyze the effects of network structure and temporal dynamics simultaneously. Here we propose new approaches that utilize two separate sets of parameters to account for degree heterogeneity and homophily, respectively. Meanwhile, the baseline intensity function is left completely unspecified to flexibly capture the time-varying pattern of the underlying process.
Under a semi-parametric model, we apply the fused smoothly clipped absolute deviation (SCAD) penalty to group identification. To further incorporate more dynamic structures of the network, we then propose a fully non-parametric model based on the counting process with time-varying parameters. Simulation studies are carried out to verify the consistency and asymptotic properties of the models under study and to evaluate their finite-sample performance. Our models are also applied to different network datasets for illustration.
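For readers unfamiliar with the SCAD penalty mentioned above, the sketch below writes out its standard piecewise form (with the conventional default a = 3.7) and one plausible "fused" variant that penalizes pairwise differences between parameters, so that parameters with nearly equal values are pulled into the same group. The pairwise-difference form is an assumption made for illustration; the abstract does not state the exact penalty used in the thesis, and the function names are hypothetical.

```python
def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty value for a scalar theta (requires lam > 0, a > 2)."""
    if lam <= 0 or a <= 2:
        raise ValueError("SCAD requires lam > 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Linear (LASSO-like) region near zero.
        return lam * t
    if t <= a * lam:
        # Quadratic transition region.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant region: large values are not penalized further.
    return lam * lam * (a + 1) / 2


def fused_scad(params, lam, a=3.7):
    """Sum of SCAD penalties over all pairwise differences (assumed fused form)."""
    total = 0.0
    for i in range(len(params)):
        for j in range(i + 1, len(params)):
            total += scad_penalty(params[i] - params[j], lam, a)
    return total


if __name__ == "__main__":
    # Two identical parameters and one distant parameter: only the two
    # cross-group differences contribute, each at the constant level.
    print(fused_scad([1.0, 1.0, 5.0], lam=1.0))
```

Because the penalty is flat beyond a * lam, well-separated groups are not shrunk toward each other, while small differences are driven to zero much as under a LASSO penalty.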

2023-01-01T00:00:00Z