Table of Contents

Learn GPyOpt

Bayesian Modeling

optimize_restarts

GPyOpt.core.task.SingleObjective: the objective function should take 2-dimensional numpy arrays as inputs and produce 2-dimensional numpy arrays as outputs. Each row contains a location (for the inputs) or a function evaluation (for the outputs).
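A minimal sketch of this I/O contract, with a toy quadratic objective (the function name and values here are illustrative, not from GPyOpt):

```python
import numpy as np

def objective(X):
    """Objective in the shape SingleObjective expects:
    X has shape (n_points, n_dims); returns shape (n_points, 1),
    one row per evaluation."""
    return np.sum(np.square(X), axis=1, keepdims=True)

X = np.array([[0.0, 1.0],
              [2.0, 2.0]])   # two input locations, each a row
Y = objective(X)             # shape (2, 1): one evaluation per row
print(Y.shape)               # (2, 1)
```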

In GPyOpt.methods.ModularBayesianOptimization(model, feasible_region, objective, acquisition, evaluator, initial_design), it seems that only evaluator is actually used in the BO computation; acquisition is only used for printing information and for plotting the acquisition function.

https://github.com/SheffieldML/GPyOpt/blob/e3f31301a11c382789c6de1066bf0437b4f1660b/manual/GPyOpt_context.ipynb

In this notebook we are going to see how to use GPyOpt to solve optimization problems in which certain variables are fixed during the optimization phase. These are called context variables. For details see:
Krause, A. & Ong, C. S. Contextual Gaussian process bandit optimization. Advances in Neural Information Processing Systems (NIPS), 2011, 2447–2455.

Code Structure Study

Questions

  1. self.modular_optimization seems to appear only in BayesianOptimization and ModularBayesianOptimization; the search results show no other place that uses this parameter.
  2. When a ContextManager and constraints are both present, the indexing seems to change. Check optimization.optimizer.OptimizationWithContext() for the implementation details.
  3. What are the so-called anchor_points?
  4. How does DuplicateManager take effect? Is it needed in sequential BO?
  5. AcquisitionLCB, AcquisitionOptimizer, and ModularBayesianOptimization all require space; do these conflict?
    • See the notes below: in practice, constraints are only used in …

Detailed Notes

ContextManager records the indices and values of the fixed variables; optimization.optimizer.OptimizationWithContext() then strips the fixed variables out of the original objective function's inputs, so what actually reaches the AF optimizer contains only the non-context variables. The full input vector can be recovered via ContextManager._expand_vector().
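A plain-Python sketch of this strip/expand behavior (the class and method names below are illustrative, not GPyOpt's actual implementation):

```python
class TinyContextManager:
    """Toy mimic of ContextManager: remember which dimensions are
    fixed (context) and at what value."""

    def __init__(self, n_dims, context):      # context: {index: value}
        self.context_idx = sorted(context)
        self.context_val = [context[i] for i in self.context_idx]
        self.free_idx = [i for i in range(n_dims) if i not in context]

    def strip(self, x):
        # keep only the non-context coordinates (what the AF optimizer sees)
        return [x[i] for i in self.free_idx]

    def expand(self, x_free):
        # rebuild the full vector, analogous to _expand_vector()
        full = dict(zip(self.free_idx, x_free))
        full.update(zip(self.context_idx, self.context_val))
        return [full[i] for i in range(len(full))]

cm = TinyContextManager(3, {1: 0.5})   # fix dimension 1 at 0.5
print(cm.strip([9.0, 0.5, 7.0]))       # [9.0, 7.0]
print(cm.expand([9.0, 7.0]))           # [9.0, 0.5, 7.0]
```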

The constraints parameter set when first initializing GPyOpt.Design_space is passed, along with the space parameter, into GPyOpt.optimization.AcquisitionOptimizer. It is actually invoked in GPyOpt.experiment_design.random_design.RandomDesign.get_samples_with_constraints(), and the configured constraints are ultimately evaluated in GPyOpt.core.task.space.Design_space.indicator_constraints() (the only place, according to the search results). By that point x has already been transformed by ContextManager and no longer contains the context variables (the key step happens when GPyOpt.optimization.anchor_points_generator.AnchorPointsGenerator.get() redefines its internal space variable, lines 23–24), so the indices used in the constraints should refer to this x, not to the original GP input variables.
(Why is it designed this way?)
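A hedged sketch of how an indicator_constraints-style check works. GPyOpt expresses constraints as strings over a 2-D array x that are feasible where the expression is ≤ 0; the helper function below is a toy re-implementation, and note that the column indices in the string refer to the context-stripped x:

```python
import numpy as np

# One constraint in GPyOpt's dict format: feasible where x0 + x1 - 1 <= 0.
constraints = [{'name': 'constr_1', 'constraint': 'x[:,0] + x[:,1] - 1'}]

def indicator(x, constraints):
    """Return a (n_points, 1) array of 1.0 (feasible) / 0.0 (infeasible),
    mimicking Design_space.indicator_constraints."""
    ind = np.ones((x.shape[0], 1))
    for c in constraints:
        val = eval(c['constraint'], {'x': x, 'np': np})
        ind *= (np.asarray(val) <= 0).astype(float).reshape(-1, 1)
    return ind

x = np.array([[0.2, 0.3],    # 0.2 + 0.3 - 1 = -0.5  -> feasible
              [0.9, 0.9]])   # 0.9 + 0.9 - 1 =  0.8  -> infeasible
print(indicator(x, constraints))
```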

(Why is RandomDesign used here? Shouldn't we be maximizing the AF? How are the two related?)
[I am currently using ObjectiveAnchorPointsGenerator, which defaults to a random initial design when generating anchor points. In the actual computation, it randomly generates 1000 (the default) points in the design space without constraints, then uses the constraints to discard the points that violate them, looping until more than 1000 points have been generated. From these, 5 (the default) anchor points are selected as initial values for local optimization; the minimum found starting from these 5 points is taken as the final result of optimizing the AF, i.e. the final self.suggested_sample.]
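The procedure above can be sketched in a few lines. This is a toy 1-D version under stated assumptions (a hypothetical feasibility test and a toy acquisition score stand in for the real constraint check and AF), not GPyOpt's code:

```python
import random

def anchor_points(n_candidates=1000, k=5,
                  feasible=lambda x: x >= 0.0,          # stand-in constraint
                  score=lambda x: (x - 0.3) ** 2,       # stand-in acquisition
                  seed=0):
    """Draw random candidates, filter by the constraint, loop until
    enough survive, keep the k best as local-optimization starts."""
    rng = random.Random(seed)
    candidates = []
    while len(candidates) < n_candidates:
        batch = [rng.uniform(-1, 1) for _ in range(n_candidates)]
        candidates += [x for x in batch if feasible(x)]
    candidates.sort(key=score)      # lower acquisition value is better
    return candidates[:k]           # initial points for local optimization

anchors = anchor_points()
print(anchors)                      # 5 feasible points near the AF minimum
```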

If max_iter is greater than 1, calling bo.run_optimization() can still be made to skip updating the GP, but there is no way to update the context variables at each step.

Everything ultimately goes through BO.run_optimization(). run_optimization returns nothing; all computed values are stored in the corresponding attributes of the BO object.

self.max_iter, self.max_time: only govern the timing of the BO iterations; they do not terminate GP training early, so GP training runs to completion even if it exceeds max_time.
self.Y_best: only used for plotting; otherwise unused.
self.X, self.Y: initialized from the input values; new self.Y_new values keep being appended.
self.fx_opt: the minimum of self.Y. Note that this is an objective function value, so if the objective itself is stochastic, this value is stochastic too and differs from the value computed directly from the GP.
self.x_opt: the location corresponding to the minimum of self.Y.
self.model_update_interval: controls how many new samples are collected before the GP is updated. How do we avoid sampling the same location each time?
self.de_duplication: can this be combined with model_update_interval to prevent sampling the same location?
self.suggested_sample: the next batch of points (or single point, if the batch size is 1) computed by self._compute_next_evaluations().
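The attributes above can be tied together in a schematic loop. Everything here is a toy stand-in (random "acquisition", noisy quadratic objective, counter instead of a GP refit); only the attribute names and the flow mirror the notes, not GPyOpt's actual code:

```python
import random

class ToyBO:
    """Schematic of the attributes described above; not GPyOpt's code."""

    def __init__(self, X, Y, model_update_interval=1, seed=0):
        self.X, self.Y = list(X), list(Y)     # grow as new samples arrive
        self.model_update_interval = model_update_interval
        self.num_acquisitions = 0
        self.model_updates = 0                # stand-in for GP refits
        self.rng = random.Random(seed)

    def _objective(self, x):                  # noisy toy objective
        return (x - 0.3) ** 2 + self.rng.gauss(0, 0.01)

    def _compute_next_evaluations(self):      # stand-in for optimizing the AF
        return self.rng.uniform(0, 1)

    def run_optimization(self, max_iter):
        # returns nothing: all results live in the object's attributes
        for _ in range(max_iter):
            x_new = self._compute_next_evaluations()
            self.X.append(x_new)
            self.Y.append(self._objective(x_new))
            self.num_acquisitions += 1
            if self.num_acquisitions % self.model_update_interval == 0:
                self.model_updates += 1       # GP updated only every interval
        self.fx_opt = min(self.Y)             # noisy objective value, not GP mean
        self.x_opt = self.X[self.Y.index(self.fx_opt)]

bo = ToyBO(X=[0.0], Y=[0.09], model_update_interval=2)
bo.run_optimization(max_iter=4)               # GP "updated" only twice
```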

Different Acquisition Functions

Verified: the AF formulas are as shown in GPyOpt Tutorial: 2.2 Acquisition Function, but in the actual code the optimization is carried out as a minimization of the negated acquisition:
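The sign-flip pattern itself is simple: since the optimizer minimizes, maximizing an acquisition a(x) is done by minimizing -a(x). A toy illustration (the acquisition below is made up, not one of GPyOpt's):

```python
def acquisition(x):
    return -(x - 0.5) ** 2 + 1.0   # toy AF, maximized at x = 0.5

def negated(x):
    return -acquisition(x)         # what the minimizer actually sees

xs = [i / 100 for i in range(101)]
x_best = min(xs, key=negated)      # minimizing -a(x) == maximizing a(x)
print(x_best)                      # 0.5
```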

Learn GPy

The GPy documentation website is pretty much useless IMO. It is all incomplete API descriptions with no insight into the overall structure.

The tutorials don't cover the full capability of the package, and most of them are old.

However, the __init__.py in each module contains many helpful and insightful descriptions and introductions.

Judging from models/__init__.py, the classes for different models are organized into separate files. Many files contain only a single class, and the class name often differs from the filename.

https://github.com/SheffieldML/GPy/blob/devel/GPy/core/__init__.py

optimize_restarts: Perform random restarts of the model, and set the model to the best seen solution.
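The random-restart idea can be sketched as follows. This is a toy re-implementation of the concept (a crude gradient-descent local optimizer on a two-minimum function), not GPy's optimize_restarts, whose local optimizer and model handling are far more involved:

```python
import random

def local_opt(f, x0, steps=300, lr=0.05):
    """Crude local minimizer via central-difference gradient descent."""
    x = x0
    for _ in range(steps):
        grad = (f(x + 1e-6) - f(x - 1e-6)) / 2e-6
        x -= lr * grad
    return x

def optimize_restarts(f, num_restarts=5, seed=0):
    """Run the local optimizer from several random starts and keep
    the best seen solution, as the docstring above describes."""
    rng = random.Random(seed)
    best = None
    for _ in range(num_restarts):
        x = local_opt(f, rng.uniform(-1.5, 1.5))
        if best is None or f(x) < f(best):
            best = x
    return best

# A function with two local minima at x = -1 and x = +1:
x_star = optimize_restarts(lambda x: (x ** 2 - 1) ** 2)
```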

Development Branch

The devel branch documentation contains detailed information.

Learn Paramz

Paramz documents