ro's blog



Furusato Nozei (ふるさと納税)

Posted on 2019-10-28 | In Life in Japan |

What is Furusato Nozei

The Furusato Nozei Program, or Hometown Tax Donation Program, is a tax incentive scheme by the Japanese government to support the development of smaller, less funded municipalities.

Taxpayers can choose to donate to a city, prefecture, municipality or cause that they want to support, such as social and environmental programs and aid in times of disaster.

In return, taxpayers are exempted from a portion of their income and resident taxes and get to enjoy gifts such as wagyu, melons, sake and other premium local specialties delivered to them as a token of appreciation.

In brief, Furusato Nozei lets you receive wonderful gifts※ and support the causes you believe in with your tax money at the same time.

※ Strictly speaking, the gifts cost you 2,000 JPY out of pocket, but their value is far beyond this.

Read more »

《From Paraphrase Database to Compositional Paraphrase Model and Back》

Posted on 2018-12-13 | In Papers |

arxiv
2015.06

My earlier notes on word embeddings: Study Notes — Word Embedding

Overview

The goal of paraphrase detection is to decide whether two sentences that differ in structure and wording share the same meaning. Paraphrase detection is useful for many NLP tasks, such as QA, semantic parsing, textual entailment, and machine translation.

The paper is organized around PPDB (the Paraphrase Database), an automatically extracted database of millions of paraphrases, containing a large number of phrase/word pairs with the same meaning.

Several drawbacks of PPDB:

  • Coverage is limited: both phrases being compared must already appear in PPDB
  • It is a nonparametric paraphrase model: the number of parameters (phrase pairs) grows with the dataset size and can become unwieldy in practice
  • Its confidence estimates are a heuristic combination of features of unclear quality

The paper's main contributions:

  • New PARAGRAM word vectors
  • Several methods for embedding phrases using PPDB
  • Two new datasets, Annotated-PPDB and ML-Paraphrase
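A simple compositional baseline in this line of work embeds a phrase as the average of its word vectors and scores a candidate pair by cosine similarity. A minimal sketch of that idea, with made-up toy vectors (the values below are illustrative, not real PARAGRAM embeddings):

```python
import math

# Toy word vectors (illustrative values, not real PARAGRAM embeddings)
vectors = {
    "big":   [0.9, 0.1, 0.0],
    "large": [0.8, 0.2, 0.1],
    "cat":   [0.1, 0.9, 0.3],
}

def embed_phrase(phrase):
    """Embed a phrase as the element-wise average of its word vectors."""
    words = phrase.split()
    dim = len(next(iter(vectors.values())))
    avg = [0.0] * dim
    for w in words:
        for i, v in enumerate(vectors[w]):
            avg[i] += v / len(words)
    return avg

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Paraphrase-like pairs score higher than unrelated words
sim_para = cosine(embed_phrase("big cat"), embed_phrase("large cat"))
sim_diff = cosine(embed_phrase("big"), embed_phrase("cat"))
```

Word averaging is deliberately crude; the paper's point is that even such simple compositional models can be trained well on PPDB pairs.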
Read more »

Kaggle: Quora Insincere Questions Classification

Posted on 2018-12-06 | In Competitions |

From now on, part of my reports will be written in English.

Introduction

Quora Insincere Questions Classification

  • Target: A binary classifier to identify insincere questions
  • Evaluation: F1 Score
  • Environment: Kernels only
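Since the competition is scored by F1, a quick reminder of how that metric is computed from binary predictions (a minimal sketch, not the Kaggle scorer itself):

```python
def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

print(f1_score([1, 0, 1, 1], [1, 0, 0, 1]))  # precision 1.0, recall 2/3 -> 0.8
```

F1 (rather than accuracy) matters here because insincere questions are a small minority class, so a classifier that predicts all zeros would look deceptively accurate.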
Read more »

《BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding》

Posted on 2018-11-01 | In Papers |

arxiv
Google AI Language
2018.10

Main contributions

  • 提出 BERT (Bidirectional Encoder Representations from Transformers)
  • 提出新的 pre-training objective MLM (masked language model)

The Transformer comes from 《Attention is all you need》 (paper notes)
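The MLM objective corrupts a fraction of input tokens and trains the model to recover them; per the paper, 15% of positions are selected, of which 80% become [MASK], 10% become a random token, and 10% are left unchanged. A rough sketch of just the corruption step (the tokenizer and vocabulary here are simplified placeholders, not BERT's WordPiece):

```python
import random

def mask_tokens(tokens, vocab, mask_rate=0.15, seed=0):
    """Corrupt tokens for the masked-LM objective.
    Returns (corrupted tokens, indices the model must predict)."""
    rng = random.Random(seed)
    out = list(tokens)
    targets = []
    for i in range(len(tokens)):
        if rng.random() < mask_rate:
            targets.append(i)
            r = rng.random()
            if r < 0.8:            # 80% of selected: replace with [MASK]
                out[i] = "[MASK]"
            elif r < 0.9:          # 10% of selected: replace with a random token
                out[i] = rng.choice(vocab)
            # remaining 10% of selected: keep the original token
    return out, targets

tokens = ["the", "cat", "sat", "on", "the", "mat"]
corrupted, targets = mask_tokens(tokens, vocab=["dog", "ran", "hat"])
```

Keeping some selected tokens unchanged (and randomizing others) is what lets the model stay useful at fine-tuning time, when [MASK] never appears.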

Read more »

Reproducing MemN2N with Keras

Posted on 2018-05-16 | In Paper Reproduction |

Part of a series on reproducing memory networks with Keras


Overview

Reference paper: 《End-To-End Memory Networks》

Dataset: bAbI-tasks

Final source code: keras-MemN2N (GitHub)

Prior posts:

  • Reading notes on 《End-To-End Memory Networks》
  • A summary of related Memory Networks papers
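The core of MemN2N is soft memory addressing: the query is matched against each memory slot by dot product, the scores are softmax-normalized, and the output is the attention-weighted sum of the output-side memory vectors. A minimal single-hop sketch with toy vectors (in the real model the input/output memory embeddings A and C are learned; these values are only illustrative):

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def memn2n_hop(query, memory_in, memory_out):
    """One MemN2N hop: p_i = softmax(q . m_i), then o = sum_i p_i * c_i."""
    probs = softmax([sum(q * m for q, m in zip(query, mem)) for mem in memory_in])
    dim = len(memory_out[0])
    return [sum(p * c[d] for p, c in zip(probs, memory_out)) for d in range(dim)]

query = [1.0, 0.0]
memory_in = [[1.0, 0.0], [0.0, 1.0]]   # sentences under input embedding A
memory_out = [[2.0, 0.0], [0.0, 2.0]]  # same sentences under output embedding C
o = memn2n_hop(query, memory_in, memory_out)
```

Stacking several such hops (feeding q + o into the next hop) is what makes the model "end-to-end": the soft attention is differentiable, unlike the hard memory lookup in the original Memory Networks.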
Read more »

Keras error: An operation has `None` for gradient.

Posted on 2018-05-13 | In Tidbits |

Conclusion: when writing a custom layer, do not define trainable variables in build() that are never used in call().

Read more »

A wrong way to use Keras get_weights()

Posted on 2018-05-12 | In Tidbits |

Conclusion: values fetched with get_weights() and placed into a model are snapshots; they are not updated across training iterations.

Read more »

On Keras's Embedding Initializer

Posted on 2018-04-24 | In Tidbits |

Conclusion: my brain had stopped working orz

Read more »

Study Notes — A Summary of Memory Networks Papers

Posted on 2018-04-18 | In Study Notes |

A brief roundup of the Memory Networks papers I have read.

Related architectures

  • Memory Networks / Memory Neural Networks(MemNNs)
  • End-to-End Memory Networks(MemN2N)
  • Key-Value Memory Networks(KV-MemNNs)
  • Dynamic Memory Networks(DMN)
  • Improved Dynamic Memory Networks(DMN+)
Read more »

Study Notes — LightGBM (Light Gradient Boosting Machine)

Posted on 2018-04-03 | In Study Notes |

Previous post: Study Notes — Ensemble Learning

Related paper: 《LightGBM: A Highly Efficient Gradient Boosting Decision Tree》

GitHub: LightGBM
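Much of LightGBM's speed comes from histogram-based split finding: feature values are bucketed into a small number of bins, and candidate splits are scanned over bin boundaries instead of over all raw sorted values. A toy sketch of that idea for a single feature (pure Python, using the standard sum-of-gradients-squared gain proxy; this is not the actual LightGBM implementation):

```python
def best_histogram_split(xs, grads, n_bins=8):
    """Bucket feature values into equal-width bins, then scan bin boundaries
    for the split maximizing the gain proxy sum_grad^2 / count per side."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_bins or 1.0
    # Accumulate per-bin gradient sums and counts (one pass over the data)
    sum_g = [0.0] * n_bins
    count = [0] * n_bins
    for x, g in zip(xs, grads):
        b = min(int((x - lo) / width), n_bins - 1)
        sum_g[b] += g
        count[b] += 1
    total_g, total_n = sum(sum_g), len(xs)
    best_gain, best_edge = float("-inf"), None
    left_g, left_n = 0.0, 0
    for b in range(n_bins - 1):        # candidate split after each bin
        left_g += sum_g[b]
        left_n += count[b]
        right_g, right_n = total_g - left_g, total_n - left_n
        if left_n == 0 or right_n == 0:
            continue
        gain = left_g ** 2 / left_n + right_g ** 2 / right_n
        if gain > best_gain:
            best_gain, best_edge = gain, lo + (b + 1) * width
    return best_edge

# Gradients flip sign at x = 5, so the best split should land near there
xs = [1, 2, 3, 4, 6, 7, 8, 9]
grads = [-1, -1, -1, -1, 1, 1, 1, 1]
edge = best_histogram_split(xs, grads)
```

Scanning O(n_bins) boundaries instead of O(n) sorted values is why histogram-based boosting scales so well; LightGBM adds GOSS sampling and exclusive feature bundling on top of this.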

Read more »

rocuku

I know nothing except the fact of my ignorance

36 posts
6 categories
35 tags
GitHub E-Mail
© 2016 — 2019 rocuku
Powered by Hexo | Theme — NexT.Mist v6.0.6